ABSTRACT

Twenty-eight mature women were recruited from the
community and trained in a 21-category observation code of family
interaction. Observers were assigned randomly to three experimental
groups and given different expectancy rationales about the outcomes
of the studies for which they would be collecting data. All groups
were told they would be observing a family under a father-present and
father-absent condition. One group was led to expect an increase,
another a decrease, and a third no change in the rate of deviant
behavior for the boys in the family as conditions changed from
father-present to father-absent. None of the groups were told they
would be observing identical videotape recordings of family
interaction permitting comparison of observation data across groups.
Results indicated that the expectations of experimental outcomes
differed significantly for the three groups. However, observers were
totally unbiased in their reports of deviant behavior in group
comparisons. Failure to obtain evidence for observer bias in spite of
the demonstrated manipulation of observer expectations was attributed
to the precautions taken to assure high levels of observer accuracy
throughout the study. (Author)

CHAPTER I
INTRODUCTION

Two kinds of expectancy effects in behavioral research are potentially
damaging to the results obtained. One type affects the actual response
of the subject of the experiment and the other the data collection
process (Rosenthal, 1969, p. 182). The latter is a particular problem
where human observers rather than automated methods of data collection
are employed. This chapter will review the methodological problems
presented by these expectancy effects together with relevant studies.
The design for the present investigation of observer bias can be found
at the end of the chapter.

While meaningful research could hardly be conducted without hypotheses
regarding outcomes, the experimenter's expectancies have the potential for
subtly confounding the results. Intentional or unintentional communication
of the experimenter's expectancies may differentially affect subject or
observer responses as a function of the subject's treatment condition.
Furthermore, while the research design, procedures, and interpretation of
the data are public matters, the effect of the experimenter's expectancies
upon the subject's or observer's behavior is not open to public scrutiny
and may occur without the experimenter's awareness. Not even independent
replication of experimental results guarantees control against such
expectancy effects (Rosenthal, 1969, pp. 195-196).

Expectancy effects should be of particular concern to investigators
conducting evaluations of treatment outcomes of child behavior therapy.
Pawlicki (1970) cites the lack of control groups and lack of controls for
observer bias as the two most frequent methodological deficiencies in his
review of research on child behavior therapy. It is appropriate that most
of the studies reviewed below have been drawn from contemporary research
in behavior modification.

Expectancy Effects upon the Subject's Behavior 

Clever Hans, the horse belonging to Mr. von Osten, a German mathematics
teacher, illustrates the subtle communication of expectancies to
the subject of an experiment. Clever Hans could add, subtract, multiply,
and divide by tapping with his hoof the answers to problems presented by
his master and others. His master was unaware of cuing the horse in any
way, although careful evaluation by Pfungst (1911) revealed that when the
horse could not see his questioner he ceased to be clever. When he
arrived at the correct number of taps, the horse was cued by a nodding of
the questioner's head.

Such expectancies may be directly communicated to the subject by the
experimenter as in the case of Clever Hans. However, cultural expectancies
may affect the subject's behavior independently of the experimenter's
expectancies. Hathaway (1948) has argued persuasively for the cultural
pressures on patients to appear "sick" upon entering and "well" upon
leaving therapy. Such effects created by the apparent expectancies of
the situation are referred to as "demand characteristics" by Orne (1969,
pp. 147-148). Since cultural expectancies frequently converge with those
of the experimenter, especially where the study is of therapy outcomes,
no clear distinction will be made between demand characteristics and
experimenter expectancy effects upon the subject's behavior in this
chapter.

Research on the demand characteristics involved in naturalistic
observation has been conducted by several behavior modifiers. Johnson and
Lobitz (1972) provide convincing evidence that it is possible for parents
of "normal" children to "fake" good and bad child management during home
observations. Twelve sets of parents of preschool children were asked to
do everything in their power to make their children appear "good" on
three days of a six-day home observation period and "bad" on the remaining
three days. Parents alternated from "good" to "bad" days in a
counterbalanced design. Rates of deviant behavior, parental commands, and
"negative responses" by the parents consistently and significantly differed
from "good" to "bad" days across families.

If parents of "normal" children can potentially "fake" the data
according to the demands of an experimental situation, is it possible that
the treatment effects reported for the families of deviant children
undergoing behavior therapy are due merely to "fakeability" according to the
demands of the situation? A placebo study by Walter and Gilmore (1972)
suggests that the effects of behavioral intervention in deviant families
cannot be accounted for by the demand characteristics of the treatment
situation or observer expectancies. The investigators had 12 families
with socially aggressive, predelinquent boys come to a prestigious
research institute for treatment of their boys' behavior problems. Half of
the families received group behavioral intervention focused on the
treatment of specific behavior problems (Patterson, Cobb, & Ray, 1972) and
half a plausible, leaderless group placebo treatment. Expectations for
change remained high for subjects in both groups. Observers collecting data
in the homes were kept uninformed regarding group membership. The
experimental families showed a significant change while the placebo families
remained unchanged. It is hypothesized by the present author that parents
of deviant children have less control over the behavior of their children
than the parents of "normal" children, making it difficult for the former
to "fake" the data as "normal" families could in the Johnson and Lobitz
(1972) study.

No studies have yet been conducted which examine the effects of the
observer's expectancies upon the behavior of the subjects in naturalistic
observation (Johnson & Bolstad, 1972). Rosenthal's review of experimenter
effects in studies of human learning and ability, psychophysical
judgment, reaction time, inkblot tests, structured laboratory interviews, and
person perception suggests this possibility. Critiques point out errors
in Rosenthal's analysis and interpretation of the data (Barber & Silver,
1968; Snow, 1968; Thorndike, 1968), but the possibility of observer
expectancy effects on the subject's behavior remains. However, this
author sees the effects of experimenter expectancies upon the observer's
data recording behavior as a more serious methodological problem.

Expectancy Effects upon the Observer's Behavior

A different expectancy effect is illustrated in the physical sciences
by the case of the infamous N-rays (Rostand, 1960). In 1903, a
distinguished physicist, M. René Blondlot, Professor of Science at the
University of Nancy, reported a discovery during his research on X-rays.
Blondlot came across new rays quite distinct from X-rays. They were
stronger in that they could penetrate metals and a great many other
substances normally opaque to all known spectral radiation. In particular,
when they struck a small spark or flame or any luminous object, they
increased the brightness of these sources of light. He chose to call them
"N-rays" to honor the site of their discovery. For two years physicists
replicated Blondlot's findings to the point of producing photographs of
the effects of the N-rays upon electric sparks and, by means of prisms,
lenses, and other measures, independently assessing the wave lengths of
the N-rays with good agreement. The reflective and refractive properties
of the N-rays were shown to be unique, supporting the significance of the
discovery. Such unintended distortions of the data by a group of
respected scientists continued to grow through the two-year period until
skeptics with opposing biases accumulated evidence to the contrary.
Rostand (1960) attributed the collective delusion to preconceived ideas
and auto-suggestion coupled with the possibility of an overzealous
laboratory assistant bent on flattery or deception.

Observer bias may also be a significant problem in the behavioral
sciences today. Current research in behavior modification relies almost
exclusively upon naturalistic observation as the method of data collection
and the criterion of treatment effectiveness. Various reviews (Johnson &
Bolstad, 1972; O'Leary & Kent, 1972) document the fallibility of the
human observer as a data collector.

The problem of observer bias has received less attention from Rosenthal
and his colleagues than experimenter bias and demand characteristics.
However, Rosenthal (1966, p. 14) presents the most complete catalogue of
possible sources of observer bias in the literature with documentation
from the various sciences. Observer bias may occur in the form of
recording errors (Kennedy & Uphoff, 1939; Rosenthal, Friedman, Johnson,
Fode, Schill, White, & Vikan, 1964), where an average of 1% of the
recordings were in error and 71% of the errors were biased in the direction
of the experimental hypothesis; computational errors (Laszlo & Rosenthal,
1967; Rosenthal et al., 1964; Rosenthal & Hall, 1968), which, when
rechecked, showed errors by 65% of the 3^ experimenters, of which 73% of
the errors were biased; interpretive errors (Smith & Hyman, 1950), where
recordings of interviews matched for content were interpreted differently
as a function of the political labels placed on the respondents being
interviewed; and intentional errors (Azrin, Holz, Ulrich, & Goldiamond,
1961; Rosenthal & Lawson, 1964), where undergraduates in laboratory
psychology classes distorted data to confirm well-known theories of learning
and personality.

Rosenthal (1966) considers interpretive errors as the least difficult
to control as the data upon which interpretations are based are generally
open to public scrutiny and reinterpretation. Scientific integrity and
failure to replicate tend to prevent intentional errors. Computational
errors may be controlled by careful rechecking of the data. Least public
and most difficult to control are the recording errors made by observers.
The major focus of the present study is the effect of the experimenter's
expectancies, directly communicated, upon the recording errors made by
observers. As behavior modifiers use naturalistic observation as their
sole criterion of treatment outcomes and rarely control for the effects
of observer bias (Kass & O'Leary, 1970; Pawlicki, 1970), it behooves them
to carefully study the circumstances under which observer bias occurs.
The few studies where observer bias has been systematically evaluated
will be reviewed below. Special attention will be given to the conditions
associated with the occurrence of bias.

Azrin et al. (1966) had untrained, undergraduate observers track
"expressions of opinion" by adults with whom they were conversing. When
observers were given an operant interpretation of the phenomenon under
study, observations were the mirror image of later reports when observers
were exposed to a psychodynamic reinterpretation. It was unlikely that a
slight modification in the experimental procedures (shifting from
extinction to disagreement) could produce the highly significant differences
reported. Simultaneous observations of the same phenomenon from
audiotape recordings by a group of the student observers produced very poor
inter- and intra-observer agreement. Use of a confederate during a
replication of the study revealed fabrication of the data to confirm the
theoretical notions advanced by the class instructor. A further attempt
to replicate the study with graduate student observers failed, confirming
that the results originally reported by the undergraduate observers were
due to intentional errors.

Rapp (1965), cited in Rosenthal (1966, p. 21), had eight pairs of
observers describe the behavior of a given nursery school child for one
minute. A member of each pair had been falsely told that the child under
observation was feeling "under par" and the other that the child was
"above par." Seven of the eight pairs of observers wrote descriptions
that differed significantly in the direction of the expectations given
them. Clearly, the definitions of such global behaviors as "above and
below par" are vague. The description of the study suggests that any
measures of inter-observer agreement taken prior to the differential
biasing of the observer pairs would have been low.

Scott, Burton, and Yarrow (1967) compared the observations of an
informed observer (Scott) with uninformed observers as they observed the
same nursery school child's behavior. Inter-observer agreement on 12
discrete categories of peer interaction was relatively low (.54), but
when categories were combined to form frequencies of "positive" and
"negative" peer interactions, agreement rose to .89. Both sets of observations
confirmed the experimental hypothesis but the informed observer's results
provided significantly stronger support. The amount of training,
experience, and background of the uninformed observers used in this study
were unspecified. It is difficult to determine whether the differences
were due to the degree of information given the two sets of observers or
to selection differences. Furthermore, the small number of informed
observers (N = 1) makes generalization to other informed observers risky.

A field study employing uninformed "calibrating" observers to assess
the accuracy and objectivity of a staff of informed observers was
reported by Skindrud (1972). The two calibrating observers were given the
same training as the informed observers but were uninformed as to the
treatment or "deviant" vs. "normal" status of the families observed in
their homes. There was a significant tendency for the uninformed
calibrating observers to underestimate the deviant behavior relative to the
informed observers across all treatment conditions. However, no observer
bias was found with the relatively small number of paired observations
available for study. The relatively high reliabilities reported by a
stringent measure of observer agreement (82%) may have precluded the
occurrence of bias even with the differing amounts of information available
to the two sets of observers. However, a test for observer bias under a
second condition where the informed observers were unaware of monitoring
for accuracy and objectivity and where observer agreement is known to drop
significantly (Reid, 1970; Romanczyk, Kent, Diament, & O'Leary, 1971) also
produced no measurable bias. The small N, a possible selection confound,
and an incomplete design make these results tentative. A large-scale
replication of such a field study in which expectancy effects upon both
observers' and subjects' behavior may occur should be carried out by
behavior modifiers when feasible.

An attempt was made to more systematically evaluate observer bias in
a simulation of naturalistic observation. Kass and O'Leary (1970) trained
27 undergraduate observers in a nine-category code of disruptive classroom
behaviors. Groups informed, uninformed, and misinformed as to the effects
of loud and soft teacher reprimands on disruptive classroom behavior coded
videotape recordings of classroom interaction. The rates of disruptive
behavior reported differed significantly (F = 7.67; df = 2, 24; p < .005)
in the direction of the expectations given the observers. However,
O'Leary and Kent (1972), after a re-analysis of the Kass and O'Leary (1970)
data, report that the results were confounded by a tendency for the groups
to drift apart on code definitions. Johnson and Bolstad (1972) point out
that observer drift and observer bias may be the same phenomenon in this
case. If observer drift is prevented by anchoring observers to standard
code definitions, observer bias should be less likely to occur. Had Kass
and O'Leary (1970) checked observer agreement across groups or observer
accuracy against some outside criterion during data collection, observer
drift, and, consequently, observer bias, might have been minimized.

Summary 

There is sufficient evidence that uncontrolled expectancy effects may
pose a major threat to the internal validity of experimental-field studies
under certain conditions. Demand characteristics may confound the
observations for relatively "normal" families (Johnson & Lobitz, 1972).
However, demand characteristics had no measurable effect on the families
of grossly deviant children (Walter & Gilmore, 1972). Since most child
behavior therapy is designed for deviant cases, this may not be a major
problem in the evaluation of therapy outcomes.

Observer bias may prove a more serious threat. The evidence existing
prior to the present study suggests that observer bias is likely to occur
where undergraduate observers are faced with a difficult observation task
(Azrin et al., 1966), where global or ambiguously defined code categories
such as "expressions of opinion" (Azrin et al., 1966), "above and below
par" (Rapp, 1965), and "positive and negative peer interactions" (Scott
et al., 1967) are used, or where observer drift from standard code
definitions is not controlled (Kass & O'Leary, 1970). Unfortunately, only
some of the conditions where observer bias may exist have been adequately
investigated. Does observer bias occur under more carefully controlled
observation conditions, e.g., where well-trained, mature observers,
discrete code categories, and precautions to prevent observer drift are
employed? One preliminary study (Skindrud, 1972) which used behaviorally
defined code categories and mature women observers monitored for observer
agreement during data collection suggested that observer bias is minimal
under these conditions.

Differential sensitivity to the dependent variables of a study has
been reported in two of the studies reviewed. Kass and O'Leary (1970)
noted that their uninformed "control group reported lower levels of
disruptive behavior" and "had a lower level of motivation than the other two
groups" (pp. 13-14). Skindrud (1972) found that his uninformed observers
reported significantly lower frequencies of the dependent variables over
all treatment conditions. Only the informed observers knew which 13 of
the 29 code categories were the dependent variables of interest. Such
observer differences in sensitivity to the dependent variables of a study
are likely to threaten the internal validity only where sensitivity is
proportional to the absolute rate of the dependent variable (see Skindrud,
1972) or where observers with differing sensitivities are not randomly
assigned to treatment conditions. This finding requires replication as
it was not documented statistically by Kass and O'Leary (1970) and may
be due to a possible selection confound in the Skindrud (1972) study.

Objectives of the Present Study

The present study had three general objectives: 

(1) The first was to replicate the findings of Kass and O'Leary (1970)
and Skindrud (1972) that informing observers of the predicted outcomes
and variables of a study sensitizes observers to the dependent variables
across all treatment conditions. It was predicted that informed groups
would report higher frequencies of the dependent variables throughout the
study than an uninformed control group.

(2) The second and major objective of the present study was to
cross-validate the Kass and O'Leary (1970) study with a different coding system,
population of observers, and rationale for the manipulation of observer
expectancies. The present study also attempted to control for certain
deficiencies in previous studies by careful definition of code categories,
extensive observer training, and monitoring observer accuracy during data
collection. All were hypothesized to control observer drift and minimize
observer bias. It was predicted that a powerful research design and
manipulation of observer expectancies would produce bias in spite of the
controls for the observer drift confound in the Kass and O'Leary (1970)
study instituted above.

(3) Assuming that evidence of observer bias is obtained, a third ob- 
jective was to examine possible correlates of observer bias (e.g., obser- 
ver accuracy, strength of the expectancy manipulation, behavioral 
specificity of the code categories, etc.) and develop a theory predicting 
the circumstances under which observer bias is maximized and minimized. 

Design of the Present Study 

Twenty-eight mature women were recruited from the community and
trained in a behaviorally-defined, 21-category code of family interaction.
Observers were assigned randomly to three experimental groups and given
different expectancy rationales about the outcomes of the study for which
they would be collecting data. All groups were told they would be
observing a family under a father-present and father-absent condition.
However, one group was led to expect an increase, another a decrease,
and a third no change in the rate of deviant behavior for the family
members as conditions changed from father-present to father-absent. None
of the groups was told they would be observing identical videotape
recordings of family interaction permitting comparison of observer data
across groups. To control actual changes in deviant behavior, videotapes
had been edited to match rates across father-present and father-absent
conditions. To control selection effects due to observer differences in
sensitivity to deviant behavior prior to expectancy manipulation, observers
were matched on reported rates and randomly assigned to groups. To control
observer drift from code definitions, overt random checks of observer
accuracy were made throughout data collection. To control sequence effects
due to observer fatigue or practice, order of presentation of the
father-present and father-absent videotapes was counterbalanced within groups.
A two-way analysis of variance (expectancy groups x treatment conditions)
was used to test for differential observer sensitivities (main effects)
and observer bias (interactions).
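
The logic of this analysis can be sketched in a few lines. With only two treatment conditions, the group-by-condition interaction (observer bias) is equivalent to a one-way F test on each observer's father-absent minus father-present difference, and the group main effect (differential sensitivity) is equivalent to a one-way F test on each observer's mean over both conditions. The rates below are invented for illustration, and treating condition as a repeated measure within observers is an assumption of this sketch rather than a statement of the exact model fitted in the study.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of lists of scores."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within

# Hypothetical deviant-behavior rates per observer: (father-present, father-absent).
rates = {
    "expect_increase": [(0.50, 0.52), (0.48, 0.47), (0.51, 0.50)],
    "expect_decrease": [(0.49, 0.50), (0.52, 0.51), (0.50, 0.49)],
    "expect_no_change": [(0.44, 0.45), (0.46, 0.44), (0.45, 0.46)],
}

# Interaction (observer bias): do FA - FP changes differ by expectancy group?
diffs = [[fa - fp for fp, fa in obs] for obs in rates.values()]
# Main effect (differential sensitivity): do overall reported rates differ by group?
levels = [[(fp + fa) / 2 for fp, fa in obs] for obs in rates.values()]

f_bias, df1, df2 = one_way_f(diffs)
f_sens, _, _ = one_way_f(levels)
print(f"bias (interaction): F({df1}, {df2}) = {f_bias:.2f}")
print(f"sensitivity (group main effect): F({df1}, {df2}) = {f_sens:.2f}")
```

In this made-up data the informed groups report higher rates overall but do not change differently from FP to FA, so the sensitivity F is large and the bias F is small.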



CHAPTER II 
METHOD 

Subjects (Observers)

Recruitment and selection. An advertisement was placed in the help
wanted section of the local newspaper which read:

WOMEN OBSERVER ASSISTANTS NEEDED for interesting research
project in child psychology. Requires 3 weeks training
and 2-3 weeks work. Must be over 21 years, married,
preferably with children.

After initial screening to ensure satisfaction of the age, marital status,
and scheduling requirements, 48 applicants were given a battery of
aptitude tests designed to select those easiest to train for the observation
task. Tests administered included the Minnesota Clerical Test
(Psychological Corporation), the numerical reasoning subtest of the Employee
Aptitude Survey (Psychological Services), a work sample of the observation
task developed by the investigator,² and the Bendig (1956) short form of
the Taylor Manifest Anxiety Scale. The last test was not used for
selection but given for a separate study.³ The 30 applicants scoring highest
on the three selection measures were hired as "observer trainees."

All trainees agreed to a contract requiring them to complete the
study within a limited time in order to receive payment. Two dropped out 
during the first week. The remaining 28 constituted the subjects of the 
study. 

Observer training. The trainees were divided into three groups for
optimum training size. Ninety-minute training sessions were scheduled
four days a week for three weeks. Sessions were held in a 13' x 20' room
containing an Ampex 6000 videotape recorder, a Setchell-Carlson 23" TV
monitor, and chairs and clipboards for the trainees. A shelf with two
dozen videotapes was prominently displayed to support the illusion of
participation as data collectors in a large-scale research project.

The observation task involved coding videotape recordings of family
interaction according to a 21-category family interaction code based on
the system developed by Patterson, Ray, Shaw, and Cobb (1969). One family
member was designated the subject of the observation. Observers focused
on the behavior of the subject, recorded his behavior and the reactions
of other family members to his behavior. They repeated this cycle every
six seconds so that a sequence of encoded interactions between the subject
and other family members was obtained. Every 30 seconds observers heard
a tone to signal them to move down a line on their protocol sheets. Each
sheet was designed to accommodate five minutes of family interaction.
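
The layout of a protocol sheet follows directly from these timing figures. A minimal sketch of the arithmetic (the variable names are illustrative only):

```python
CYCLE_SECONDS = 6    # one coded cycle: subject's behavior plus family reactions
LINE_SECONDS = 30    # a tone signals the move to the next line
SHEET_MINUTES = 5    # interaction covered by one protocol sheet

cycles_per_line = LINE_SECONDS // CYCLE_SECONDS
lines_per_sheet = SHEET_MINUTES * 60 // LINE_SECONDS
cycles_per_sheet = cycles_per_line * lines_per_sheet

print(cycles_per_line, lines_per_sheet, cycles_per_sheet)  # 5 10 50
```

Each line therefore holds five coded cycles and each sheet ten lines, or fifty cycles per five-minute segment.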

The training program consisted of the following steps: 

(1) Each of the trainees was given a manual (see Appendix A) and set
of flashcards for the 21-category family interaction code. Trainees were
told to familiarize themselves with the code definitions so they could
correctly repeat the elements of each definition upon presentation of all
21 flashcards prior to the first training session.

(2) The first three training sessions began with written tests on the
code definitions. Trainees watched playback of a five-minute recording
of simple family interaction while the trainer read (modeled) the correct
coding of the behavior of one of the family members. Then they practiced
coding simply that one subject's behavior. Trainees scored their protocol
sheets from feedback on the correct coding of the training segment
produced by two trainers who had repeatedly coded the segment until both
agreed 100% on the code entries. At the end of each session trainees
attempted coding typical family interaction from a set of five-minute
"test recordings."

(3) The next seven training sessions involved about an hour's practice
coding a five-minute segment of family interaction. The trainer usually
modeled the correct coding. Then trainees coded the same tape and were
given feedback on the "standard criterion coding" for that segment of
videotape. The final half-hour of the session was again a test of the
trainees' progress at coding interaction with a new "test tape" and
feedback on accuracy.

(4) On the final two days of training, observers were asked to code yet
another family. In actual fact, it was the same family to be used in the
present study, but none of the observers was aware of this. They coded
25 minutes of videotape each session with no feedback regarding accuracy.
These data provided a stable baseline measurement of the dependent
variable on which to match experimental groups. Again, the last half-hour of
both sessions was devoted to coding tests with feedback.
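
One way to combine matching on the baseline measure with random assignment is sketched below, under assumptions not stated in the text: observers are ranked by their baseline reported rate, the ranking is cut into blocks of three, and each block is shuffled across the three expectancy groups. The function name, group labels, and rates are hypothetical, and the procedure actually used in the study may have differed.

```python
import random

GROUPS = ("increase", "decrease", "no_change")

def matched_random_assignment(baseline_rates, seed=None):
    """Assign observers to expectancy groups within blocks matched on baseline rate.

    baseline_rates maps observer id -> deviant-behavior rate reported at baseline.
    """
    rng = random.Random(seed)
    ranked = sorted(baseline_rates, key=baseline_rates.get)
    assignment = {}
    for start in range(0, len(ranked), len(GROUPS)):
        block = ranked[start:start + len(GROUPS)]
        labels = list(GROUPS)
        rng.shuffle(labels)
        # A short final block (28 is not a multiple of 3) takes the first labels drawn.
        for observer, label in zip(block, labels):
            assignment[observer] = label
    return assignment

# Hypothetical baseline rates for 28 observers.
rates = {f"obs{i:02d}": 0.30 + 0.01 * i for i in range(28)}
groups = matched_random_assignment(rates, seed=1)
sizes = {g: sum(1 for v in groups.values() if v == g) for g in GROUPS}
print(sizes)
```

Because every block contributes one observer to each group, the groups end up with similar baseline distributions while the assignment within a block remains random.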

Training was conducted by the investigator and two observers from the
Social Learning Project at the Oregon Research Institute experienced in
the Patterson et al. (1969) family interaction code. The investigator
and one of the experienced observers were present at all of the training
sessions. One trainer operated the videotape equipment and the other
modeled the coding of interaction, provided feedback on accuracy, and
clarified code definitions as needed. Working in overlapping pairs tended
to ensure that the trainers remained consistent on their definition of
code categories and did not "drift apart" over the training period.

Observer accuracy during training. The objective of the training
program was 10% accuracy with the trainers' standard criterion coding of
the test recordings. The accuracy measure was a stringent one. Observers
had to record the same code as the criterion within a 12-second limit of
the corresponding criterion code entry without breaking the "stream of
behavior" to score one agreement. Percent accuracy was computed by
dividing total agreements by total disagreements. During the last week
of training, mean observer accuracy was and ranged from 5.^% to 70%.
On the final day of training two coding tests were administered with a
mean accuracy of 68%.
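
The scoring rule can be illustrated with a short sketch. Several details here are assumptions: entries are represented as (seconds, code) pairs, "without breaking the stream of behavior" is read as requiring matches to keep their original order, unmatched criterion entries are counted as the disagreements, and the percentage is taken over agreements plus disagreements (the conventional denominator) rather than disagreements alone as the text is worded. The code labels are placeholders.

```python
def percent_accuracy(criterion, observed, window=12):
    """Score observer entries against criterion entries.

    Each entry is (seconds_from_start, code). An agreement requires the same
    code within `window` seconds, and matches must keep their original order
    so the stream of behavior is not broken.
    """
    if not criterion:
        raise ValueError("criterion coding is empty")
    agreements = 0
    next_free = 0  # observer entries before this index are already used or skipped
    for crit_time, crit_code in criterion:
        for j in range(next_free, len(observed)):
            obs_time, obs_code = observed[j]
            if obs_code == crit_code and abs(obs_time - crit_time) <= window:
                agreements += 1
                next_free = j + 1
                break
    disagreements = len(criterion) - agreements  # assumption: unmatched criterion entries
    return 100.0 * agreements / (agreements + disagreements)

# Placeholder codes; the second entry is miscoded and the last falls outside the window.
criterion = [(0, "TA"), (6, "CM"), (12, "NC"), (18, "PL")]
observed = [(0, "TA"), (6, "CO"), (18, "NC"), (40, "PL")]
print(percent_accuracy(criterion, observed))  # 50.0
```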

The reader should make a clear distinction between the observer
accuracy measure used in the present study and observer agreement between
pairs of observers commonly used in experimental-field studies. Observer
agreement is frequently higher than observer accuracy, especially where
the standard criterion coding of the videotape recording is a
"fine-grained" one. The staff of professional observers employed by the Social
Learning Project using the Patterson et al. (1969) code average
agreement on their field observations but only 64% accuracy when compared
to standard codings of videotaped interaction.

Preparation of Videotape Recordings

An intact family known from a research project on "normal" families
was contacted to obtain permission for videotape recording of family
interaction in the home. The family consisted of both parents and boys
aged three, seven, and nine years. None of the family members had
undergone psychiatric or psychological treatment of any kind. They were known
to be a relatively relaxed family with three active boys and were
considered good subjects for the recording of natural family interaction
to be used in the present study. They readily consented to the recording
with compensation at the rate of $7.50 per hour.

Four videotape recordings were obtained with all family members present
and four with all except the father present. Since the three-year-old
boy, Craig, served as the subject for all of the observations, the
tapes will be generally referred to as the "Craig family tapes." The two
sets will be specifically referred to as the "father-present" (FP) and
"father-absent" (FA) tapes, respectively. All videotaping was done during
and right after the dinner hour so that setting differences (dining vs.
living room) were controlled across the two sets of tapes. Extra videotape
recording was obtained in both settings so that it would be possible
to match both FP and FA tapes on the class of behaviors to be used as the
dependent variable in this study.

During a pilot study of the design, the entire set of eight videotapes
was coded by six observers trained in the same coding system used in the
present study. Analysis of the results of the pilot study indicated that
the two sets of tapes were not matched on the dependent variable. Consequently,
five-minute segments from the FA tapes were juggled with extra
FA segments until the mean rates of the dependent variable were matched
across the two sets of FP and FA tapes (t = 0.31; df = 38; n.s. at .50
level).
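
A matching check of this kind can be illustrated with a two-sample t statistic on per-segment rates. The sketch below assumes a pooled-variance t test, which is one common choice and is consistent with df = 38 for two sets of 20 five-minute segments; the source does not state which form of the test was used, and the function name is hypothetical.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, df) for a pooled-variance two-sample t test."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance (sample variances use n - 1).
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Made-up rates for illustration only.
t_stat, dof = pooled_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

With 20 segments per tape set the same function returns df = 38; segments would be swapped until the absolute value of t is small.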
Procedures

The schedule for training and data collection sessions and the general
design of the study are illustrated in Figure 1.

Variables controlled by the design. The present design controls for
(1) actual changes in rates of deviant behavior from FP to FA tapes
(p. 18), (2) selection confounds due to observer differences in sensitivity
to the dependent variables prior to expectancy manipulation, (3)
observer drift from standard code definitions (pp. 30-31), and (4) sequence
effects from the order in which FP and FA tapes were coded due to observer
fatigue, boredom, or practice. Unequal group Ns were used to increase the
power of that part of the design assessing observer bias.

Control of the potential selection and sensitivity confounds was
achieved by rank ordering observers on their coding of the "deviant behaviors"
in the baseline tapes. Trios of observers with similar rankings
were formed and members randomly assigned to the three experimental groups.
Table 1 compares the groups on their baseline observations of deviant behavior
in the Craig family.
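
The matched-trio assignment can be sketched as follows. This is a minimal illustration, not the investigator's actual procedure: the function name, the dictionary of baseline rates, and the handling of any observer left over when the pool is not a multiple of three are assumptions.

```python
import random
from typing import Dict, Optional, Sequence


def assign_matched_trios(baseline_rates: Dict[str, float],
                         groups: Sequence[str] = ("Increase", "Control", "Decrease"),
                         seed: Optional[int] = None) -> Dict[str, str]:
    """Rank observers by baseline rate, then randomize groups within each trio."""
    rng = random.Random(seed)
    ranked = sorted(baseline_rates, key=baseline_rates.get, reverse=True)
    assignment: Dict[str, str] = {}
    for start in range(0, len(ranked), len(groups)):
        trio = ranked[start:start + len(groups)]
        labels = list(groups)
        rng.shuffle(labels)  # random order of groups within this trio
        # A short final block (pool not divisible by three) simply receives
        # the first labels of the shuffled list; this is an assumption.
        for observer, label in zip(trio, labels):
            assignment[observer] = label
    return assignment


# Hypothetical baseline rates of deviant behavior per five minutes.
rates = {f"obs{i}": 10.0 - i for i in range(9)}
groups_by_observer = assign_matched_trios(rates, seed=0)
```

Because every trio contributes one member to each group, the groups start out balanced on baseline sensitivity to the dependent variable.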

One of the three expectancy rationales was randomly assigned to each
of the experimental groups. Members of the Control group were then reassigned
randomly to the Increase and Decrease groups until they numbered 11
each. Six observers remained in the Control group. The purpose of this
reassignment was to maximize the possibility of interaction between Increase
and Decrease groups and consequently the possibility of finding
observer bias.

Possible sequence effects were controlled by counterbalancing the
order in which the FP and FA tapes were presented within each expectancy
group. Half the members coded the tapes in an FP-FA order and the remaining
half in an FA-FP order two weeks later.

Table 1

A Comparison of the Three Experimental Groups on
a Baseline Measure of the Dependent Variable

                        Increase    Control    Decrease

Mean rate of
deviant behaviors
per 5 minutes*            5.63       6.40       5.75

Standard
deviation                 1.48       1.73       1.60

* F = 0.44; df = 2, 25; N.S. at the .25 level

The three-week "layoff" for the FA-FP observers between training and
data collection could have produced differences in observer accuracy due
to inactivity and consequent deterioration of coding skills. To counteract
such a trend, one extra training session was scheduled for the FA-FP
observers just prior to their two-week data collection period. A comparison
of mean observer accuracies for the two sets of counterbalanced subgroups
on 11-9-71 and again with the FA-FP subgroups on 12-1-71 resulted
in no significant differences (F = 0.33; df = 2, 38; N.S. at .25 level).

Manipulation of the Independent Variable

Four one-hour observation sessions were held each week. FP tapes
were coded one week and FA tapes another to simulate the collection of
consecutive baseline and treatment observations in the field. Different
expectancy rationales were presented to each group at the beginning of
the first week and repeated at the beginning of the second week. The
groups were led to expect different experimental outcomes and to believe
they were each collecting data on different sets of FA tapes. Increase,
Control, and Decrease subgroups were scheduled on the same day but with
a half-hour between so that members from different expectancy groups
did not "run into" each other entering and leaving the observation room.
There was no evidence either from the post-test questionnaires or from
informal conversation that any of the observers from different expectancy
groups were aware they were viewing identical FA tapes.

To arouse observer interest in the outcome of the pseudo-study and 
lend credibility to the rationales presented, one of the three principal 
investigators from the Social Learning Project (G. R. Patterson, J. B. 
Reid, or L. A. Hamerlynck) accompanied the investigator during each of 
the presentations. The accompanying visitor was introduced as "one of 
the child psychologists at the Oregon Research Institute interested in
the outcome of the project." 

The following rationales were presented to the FP-FA order subgroups: 

Week One 

All three groups were told: "You will be coding two sets of
videotapes of family interaction: a set made with the father present
and, next week, a set with the father absent."

In addition, the Control group was told: "The purpose of this
study is to determine the effect of the father's presence upon family 
interaction." 

In addition, both the Increase and Decrease groups were told:
"These videotapes were made of a family referred for the treatment of
their boys' behavior problems. Both parents were specifically concerned
about Craig's increasingly disruptive behavior--his high rate of yells,
whines, and generally aversive behaviors. They wished to bring them
under control before Craig entered school and became a behavior problem
there. A series of videotape recordings were made in their home at
various stages of the treatment program under two conditions: with the
father present and absent. All of the groups of observers helping
collect data for this study will see a set of FP tapes made before the
family received any kind of treatment. However, each group of observers
will see a second set of tapes with the FA, made at different times in
the treatment program." (A phony research design corresponding to the
above description was sketched on the board during the presentation to
the Increase and Decrease groups. See Figure 2.)



Figure 2

The Phony Research Design Presented to the Increase and
Decrease Groups as Part of the Expectancy Rationale

[Figure: time line with the labels Family intervention condition;
Videotape recording condition; Increase group tapes; Control group
tapes; Decrease group tapes; Baseline; Post-intervention.]

In addition, the Increase group was told: "This group will see
a set of FA tapes next week which were made prior to any treatment. You
will be helping us evaluate the effect of the father's absence upon the
rate of deviant behaviors, particularly Craig's, the child with the most
behavior problems. By 'deviant behaviors,' I mean those listed in the
family interaction coding system which are generally undesirable,
specifically: crying (CR), threatening commands (CN), disapprovals (DI),
dependent requests (DP), destructiveness (DS), high rate behaviors (HR),
humiliations (HU), noncompliances (NC), hitting (PN), teases (TE), whines
(WH), and yells (YE). We predict that with only one parent there to
monitor his behavior, Craig's rate of deviant behavior will significantly
increase. Furthermore, some preliminary data from a field study by one
of our research assistants strongly suggests that this is the effect of
the father's absence. Such an effect will be easier to document from
our intensive study of videotapes than we could obtain in the field. We
are giving you this information as we have found that observer morale is
improved by informing observers of the purpose of the study for which
they are collecting data."

In addition, the Decrease group was told: "This group will see
a set of FA tapes next week which were made after both parents had been
interviewed by the treatment staff and undergone intensive training in
child management procedures. We know from considerable research that
such training greatly improves parents' ability to manage the behavior
problems of their children. We are so confident of these child management
procedures we are predicting that after treatment one parent will
be able to manage Craig's behavior better than both parents could before.
We expect to see a significant drop in Craig's deviant behaviors as we
move from coding the FP to the FA tapes in this group. By 'deviant
behaviors,' I mean those codes in the family interaction coding system
which are generally undesirable, specifically: crying (CR), threatening
commands (CN), disapprovals (DI), dependent requests (DP), destructiveness
(DS), high rate behaviors (HR), humiliations (HU), noncompliances
(NC), hitting (PN), teases (TE), whines (WH), and yells (YE). We are
giving you this information as we have found that observer morale is
improved by informing observers of the purpose of the study for which
they are collecting data."

Both Increase and Decrease groups were told: "At several points
during data collection, a count of the number of deviant behaviors you
record on a particular five-minute segment will be made and recorded by
one of the trainers. This data will not be shared with the other observers.
We wish to get a random sampling of the data to see if there
are trends supporting our predictions."

All three groups were told: "Each day your trainers will randomly
select one of the five segments you have coded for an accuracy
check against a criterion coding of the same. Do the best job you can."

Second Week

The first week's expectancy rationales were reviewed and elaborated
at the beginning of the second week when the observers returned
to begin coding the set of FA tapes.

The Increase group was told: "As you recall, we predicted a
significant increase in Craig's rate of deviant behavior on the FA tapes
you'll be coding this week. In fact, we believe there will be an increase
in deviant behavior for all family members in the father's absence,
including, for example, more teases (TE) from the brothers and more
disapprovals (DI) and threats (CN) from the mother. We have some preliminary
data from last week's deviant behavior counts for the tapes you
coded." (Figure 3 contains the graph of the data sketched on the board
for the Increase group.) "As you can see from the preliminary data I've
graphed on the board, there was an average of eight deviant behaviors
per five minutes with the father present. We predict that the presence
of only one parent will bring the average rate up to 12 deviant behaviors
per five minutes, about a 33% increase. You may not immediately notice
an increase in deviant behavior on the FA tapes as the rates vary tremendously
from one segment to another. However, the overall rates should
show an increase from the FP to the FA set."

The Decrease group was told: "As you recall, we predicted a
significant decrease in Craig's rate of deviant behavior on the set of
FA tapes you will be coding this week. In fact, we believe the behavior
of all family members will improve as a result of treatment. For example,
there should be fewer teases (TE) by the brothers and a smaller number of
disapprovals (DI) and threats (CN) by the mother. The intensive training
in child management procedures should allow one parent alone to more effectively
manage the behavior of the children than both parents could
before such training. We have some preliminary data from last week's
deviant behavior counts for the tapes you coded." (Figure 4 contains the
graph sketched on the board for this Decrease subgroup.) "As you can
see from the preliminary data I've graphed on the board, there was an
average of nine deviant behaviors per five minutes. We predict that
intensive training in child management procedures will bring the rate
down to about six deviant behaviors per five-minute observation, about
a 33% decrease. You may not immediately notice a decrease in deviant
behavior on the FA tapes as the rates vary tremendously from one segment
to another. However, the overall rates should show a decrease from the
FP to the FA tapes. Also, we asked the mother not to use the more
evident child management procedures while we were videotaping, such as
placing Craig in isolation following each behavior problem as this would
greatly disrupt videotaping. Consequently, the change in her child
management procedures may not be obvious."
(End of rationale.)

The presentation of the expectancy rationales to the FA-FP subgroups 
two weeks later was identical to that used with the FP-FA subgroups with 
one exception--it was made clear that they would be seeing the FA tapes
first. The direction of change in rate of deviant behaviors was sketched
on the board for each group so there would be no confusion about what to
expect in spite of the unnatural ordering of the videotapes, viz., post-treatment
tapes before pre-treatment tapes, etc.

On the day all group members returned for post-testing and collection 
of their paychecks, all were debriefed as to the true purpose and the 
actual design of the study. The need for research on observational methods 
of data collection in the evaluation of child behavior therapy was also 
stressed. Several months following their participation in the study all 
subjects were mailed a summary of the results to comply with ethical 
requirements ensuring the integrity of the experimenter in studies in- 
volving deception of subjects. 



Dependent Variables and Evaluative Criteria

Observer expectancies. Two measures were used to determine whether
observer expectancies were influenced by knowledge of the experimenter's 
hypothesis. In view of the possible reactive effects of such measures, 
only unobtrusive or post-test measures were administered. 

(1) Observer Assistant Inventory (Item 7). Throughout the training
program an "Observer Trainee Inventory" had been administered to assess
the morale of the trainees. The original inventory was slightly revised
by the addition of item 7, dealing with the experimenter's prediction
(see Figure 5), and relabeling it the "Observer Assistant Inventory."
The revised version was administered on the seventh day of data collection
as an unobtrusive measure of observer expectancy. A copy of the
complete inventory can be found in Appendix B.

Figure 5 

Item 7 of the Observer Assistant Inventory 

The experimenter's prediction for the set of FA tapes seen
by this group of observers was that the rate of deviant
behavior in the family would:

  |      |      |      |      |      |      |
+75%   +50%   +25%    0%    -25%   -50%   -75%

or have
unknown effects

(2) Observer Assistant Questionnaire. A questionnaire was administered
following all data collection to assess the observers' comprehension
of all elements of the rationale, their personal expectations for
change, and any suspicions they had about the true purpose of the study.
The investigator concealed all identifying data on each of the completed
questionnaires. Five graduate students acquainted with the design and
expectancy rationales used in the study were asked to sort the shuffled,
anonymous questionnaires into three categories according to their judgements
of expectancy group membership based on responses given to the
following questionnaire items:

1. In several sentences give your understanding of what this
study was about.

2. What were you told about the history of the Craig family?

3. As far as you can, indicate the variable being manipulated,
the specific variables (behaviors) of interest to
the investigator, and the investigator's prediction about
the variables measured by your group's observations of
the videotapes.

4. What evidence or arguments were presented by the investigator
to support the prediction given your group?

5. Did you have any personal expectations regarding the 
outcome of the study? If so, what were they and were 
you more motivated to see the investigator's prediction 
or your own confirmed by the results of this study? 

A copy of the complete questionnaire can be found in Appendix C and the
instructions to the five judges in Appendix D.

Observations of deviant behavior on the FP and FA tapes. Observers
coded the behavior of the subject, Craig, and the responses of family
members to his behavior every six seconds according to the procedures
outlined in the section on observer training above. Twelve of the 21
codes were regarded as deviant codes. The mean rate of the 12 deviant
behaviors reported by the observers for all family members on the FP and
FA tapes was the major dependent variable of the study.

Observer Accuracy during Observation of the FP and FA Tapes

Observers were told prior to the collection of observation data from
Craig tapes that their accuracy would be randomly spot checked during
each of the eight data collection sessions (as included in the rationales
presented to all three groups). One of the five segments of recorded
family interaction was randomly selected from the 25-minute tape
and carefully coded and recoded by two of the trainers as outlined in
the section on the preparation of the videotape recordings above. This
standard coding served as a common criterion against which observer accuracy
for all three expectancy groups could be measured. The mean observer
accuracies for the three groups are outlined in Table 2. An F test
across the three expectancy groups suggests that there were no group
differences during the two-week data collection period.

Table 2

Mean Observer Accuracies for the Three Expectancy Groups
on Spot Checks Made During Observations of
the Craig Family Videotapes

Group        N    Father-present    Father-absent    Grand mean*
                      tapes             tapes

Increase    11        59.0%             58.0%           58.5%
Control      6                          58.9%
Decrease    11        57.0%             58.2%           57.6%

* F = 0.20; df = 2, 25; N.S. at .25 level





Specific Hypotheses and Data Analysis

It was predicted that the presentation of parallel but opposing rationales
for the study would result in differing expectancies across
experimental groups as measured by the "Observer Assistant Inventory"
and the "Observer Assistant Questionnaire."

It was also predicted that such group expectancies would differentially
affect the observations of the same videotape recordings of the
Craig family across the three experimental groups such that:

(1) The Control group would be less sensitive to the dependent
variable and report lower frequencies of deviant behavior across both
the FP and FA conditions than the Increase and Decrease groups, and

(2) The frequencies of deviant behavior reported by the Increase and
Decrease groups would interact across the FP and FA conditions, attributable
to confounding observer bias from differing group expectancies.

Given evidence of confounding observer bias, it was predicted that
sub-analyses would reveal relationships between the magnitude of observer
bias and (a) temporal proximity to the expectancy manipulation, (b) deviant
behaviors targeted vs. nontargeted for change in the Increase and
Decrease rationales, and (c) observer accuracy.

CHAPTER III



RESULTS 



The first section describes the effect of the expectancy manipulation 
upon two self-report measures of observer expectancy. The second section 
examines the effects of the expectancy manipulation upon the observations 
of deviant behavior. 

The Effect of the Expectancy Manipulation 

Observer Assistant Inventory. Twenty-seven of the observers responded
to item 7 describing their expectancies regarding experimental outcome.
This reflected their understanding of the research project on day seven
of the data collection period. The mean responses of each expectancy
group to item 7 are presented in Table 3. The means obtained roughly
approximate the 33% increase, 0% change, and 33% decrease predictions included
in the expectancy rationales presented to the Increase, Control,
and Decrease groups, respectively.



Table 3

Observers' Recall of the Investigator's Prediction
Regarding Experimental Results

Group                   Increase     Control     Decrease

Group's mean
response                  +26%                     -42%
                        (N = 11)                 (N = 10)

F = 6.88; df = 2, 24; p < .01

Observer Assistant Questionnaire. Twenty-seven of the observers completed
the Observer Assistant Questionnaire (Appendix C) administered as
a post-test measure of comprehension and acceptance of the expectancy
rationales. A perfect assortment of the 27 questionnaires into appropriate
expectancy groups by the five judges would result in 55 correct
assortments of the 11 Increase questionnaires, 25 correct assortments of
the five Control questionnaires, and 55 correct assortments of the 11
Decrease questionnaires. Of the grand total of 135 judgements, only
seven were incorrect. The results are presented in Table 4.
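
The counts in this paragraph follow from five judges each sorting every questionnaire once. A small sketch of the arithmetic, with hypothetical variable names:

```python
JUDGES = 5
GROUP_SIZES = {"Increase": 11, "Control": 5, "Decrease": 11}  # completed questionnaires
INCORRECT_JUDGEMENTS = 7  # as reported in the text

# Each judge sorts every questionnaire once, so a perfect sort yields
# (group size x number of judges) correct placements per group.
perfect_counts = {group: n * JUDGES for group, n in GROUP_SIZES.items()}
total_judgements = sum(perfect_counts.values())
correct_judgements = total_judgements - INCORRECT_JUDGEMENTS
percent_correct = 100.0 * correct_judgements / total_judgements
```

On these figures the judges placed 128 of 135 questionnaires in the intended group, just under 95%.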

Table 4

Judges' Assortments of 27 Post-test Questionnaires
into Three Categories

                              Group
Judgement       Increase     Control     Decrease

Increase           54            0           1
Control             1           22           2
Decrease            0            3          52

                (N = 11)     (N = 5)     (N = 11)

χ² = 2A0; df = 4; p < .001



Responses to questionnaire item 5 describing the observers' personal
expectations were selected for separate analysis. Again, a perfect
assortment of the 27 questionnaire responses by the five judges would
result in 55, 25, and 55 correct assortments of the 11 Increase, five
Control, and 11 Decrease group responses, respectively. Of the total of
135 judgements of personal expectations, approximately half (66) were in
agreement with the expectancy rationale presented to their group. Only
4% of the judgements (6) suggested personal expectations opposed to the
experimenter's as presented. The members of the Control group were
generally without personal expectations. The results are presented in
Table 5. Separate instructions to the judges for sorting the questionnaire
responses to item 5 are in Appendix E.

Table 5

Judgements of Observers' Personal Expectations
Regarding Experimental Results

Personal                      Group
expectation     Increase     Control     Decrease

Increase           19            5           0
No change          30           20          28
Decrease            6            0          27

                (N = 11)     (N = 5)     (N = 11)

χ² = 45.14; df = 4; p < .001
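
The statistic under Table 5 can be checked against the cell counts with an ordinary Pearson chi-square for a contingency table. This is a sketch under the assumption that the standard Pearson formula was used; the function name is hypothetical and the counts are those of Table 5 (rows are the judged personal expectation, columns the observer's group).

```python
from typing import List, Tuple


def pearson_chi_square(table: List[List[int]]) -> Tuple[float, int]:
    """Return (chi-square, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


TABLE_5 = [
    [19, 5, 0],    # judged "Increase"
    [30, 20, 28],  # judged "No change"
    [6, 0, 27],    # judged "Decrease"
]
chi2_table5, df_table5 = pearson_chi_square(TABLE_5)
```

The result comes out near 45.1 on 4 degrees of freedom, in line with the value printed under the table.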



The Effect of Differential Expectations upon Reported Observations 

A 3 x 2 analysis of variance with repeated measures (Kirk, 1968, pp.
279-281) permits a test of both of the predictions regarding the effect
of the expectancy manipulation upon the reported observations of deviant
behavior:

(1) The prediction that knowledge of the specific behavior codes
constituting the dependent variable of the study would produce higher
mean rates of deviant behavior for the Increase and Decrease groups than
for the Control group, and

(2) The prediction that the differing expectancies among the three
groups would result in increases, no change, and decreases in the reports
of deviant behavior as conditions changed from FP to FA. A graph of the
results across baseline, FP, and FA conditions can be found in Figure 6.

The mean rates of deviant behavior observed per five minutes of family
interaction are presented in Table 6.

Table 6

Mean Rates of Deviant Behavior per Five Minutes Reported
by Expectancy Group across FP and FA Conditions

Group                   FP        FA

Increase (N = 11)      7.273     6.827
Control (N = 6)        8.075     7.842
Decrease (N = 11)      8.198     7.677

The 3 x 2 analysis of variance with repeated measures in Table 7 indicates
no significant main effects across groups, failing to support
hypothesis (1) above. The lack of a significant interaction between
groups and FP and FA conditions fails to support hypothesis (2) above.
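
Under the usual additive partition of sums of squares for a groups-by-conditions design with repeated measures, the two error terms follow from the totals, so the F ratios can be recomputed from the sums of squares printed in Table 7. The sketch below assumes that partition and reads the Columns sum of squares as 2.584; the derived error terms are recomputed values, not figures reported by the author.

```python
# Sums of squares and degrees of freedom read from Table 7.
SS_TOTAL = 185.613
SS_BETWEEN_SUBJECTS = 160.151
SS_ROWS, DF_ROWS = 10.703, 2                # expectancy groups
SS_COLUMNS, DF_COLUMNS = 2.584, 1           # FP vs. FA (assumed reading of the cell)
SS_INTERACTION, DF_INTERACTION = 0.163, 2   # groups x conditions
DF_ERROR = 25                               # subjects within groups, both strata

# Additive partition (assumption): total = between subjects + within subjects.
ss_within_subjects = SS_TOTAL - SS_BETWEEN_SUBJECTS
ss_error_between = SS_BETWEEN_SUBJECTS - SS_ROWS
ss_error_within = ss_within_subjects - SS_COLUMNS - SS_INTERACTION

ms_error_between = ss_error_between / DF_ERROR
ms_error_within = ss_error_within / DF_ERROR

f_rows = (SS_ROWS / DF_ROWS) / ms_error_between
f_columns = (SS_COLUMNS / DF_COLUMNS) / ms_error_within
f_interaction = (SS_INTERACTION / DF_INTERACTION) / ms_error_within
```

The recomputed between-subjects error mean square and the F ratios for groups and for the interaction agree with the printed 5.978, 0.90, and 0.09, which supports the assumed partition.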

FIGURE 6

RATES OF DEVIANT BEHAVIORS REPORTED
BY THREE GROUPS WITH DIFFERING EXPECTATIONS

(FP-FA & FA-FP COUNTERBALANCED)

Table 7

Analysis of Variance with Repeated Measures for Mean Rates of
Deviant Behavior by Expectancy Group
Across FP and FA Conditions

Source                                SS        df      MS        F

Between subjects                    160.151     27
  Rows                               10.703      2     5.351     0.90
  Subjects within groups                        25     5.978
Within subjects                      25.461     28
  Columns                             2.584      1     2.584     2.
  Rows x columns                      0.163      2     0.081     0.09
  Columns x subjects within
    groups                                      25
Total                               185.613     55

Ancillary Hypotheses

A number of predictions regarding the specific conditions under which
observer bias may occur were suggested at the end of Chapter II. They
included the possibility that observer bias may be a function of temporal
proximity to the presentation of the expectancy rationale, targeted vs.
nontargeted deviant behaviors, and/or observer accuracy. Each of these
predictions will be examined below.

The prediction that observer bias may occur only on the days when the
expectancy rationale was presented (day one of the FP and FA conditions)
and "wash out" on subsequent days was tested by plotting the data by days.
Visual inspection of the data collected on day one of the FP and FA conditions
vs. all other days, presented in Figure 7, does not suggest such
an interaction. A repeated measures analysis of variance across the eight
days of data presented in Table 8 is summarized in Table 9. No significant
interaction between expectancy group and eight days of data collection
was found. The temporal proximity hypothesis remains unsupported. (The
significant columns effect was due to the fact that the sets of tapes
were matched on rate of deviant behavior across FP and FA conditions, but
not by days within conditions.)

Table 8

Mean Rates of Deviant Behavior per Five Minutes Reported by
Expectancy Group across FP and FA Conditions by Days

                                  FP                              FA
Group                  1      2      3      4        1      2      3      4

Increase (N = 11)    6.818  5.745  8.473  8.055    7.873  6.600  6.345  6.491
Control (N = 6)      7.567  7.567  8.500  8.667    8.867  7.533  7.533  7.433
Decrease (N = 11)    7.764  6.691  8.900  9.436    8.400  7.436  7.255  7.618

Table 9

Analysis of Variance with Repeated Measures for Mean Rates
of Deviant Behavior by Expectancy Group Across
FP and FA Conditions by Days

Source                                SS         df      MS        F

Between subjects                     640.552     27
  Rows                                42.799      2    21.399     0.89
  Subjects within groups             597.754     25    23.910
Within subjects                      547.769    196
  Columns                            138.777      7    19.825     8.73*
  Rows x columns                      11.552     14     0.825     0.36
  Columns x subjects within
    groups                           397.439    175     2.271
Total                              1,188.321    223

* p < .0005
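
Since each condition spans four days of coding, the daily means in Table 8 should average to the condition means in Table 6. A short sketch of that cross-check, using the Table 8 entries as printed (the dictionary layout and function name are hypothetical):

```python
from typing import Dict, List

# Daily mean rates of deviant behavior per five minutes, from Table 8.
DAILY_MEANS: Dict[str, Dict[str, List[float]]] = {
    "Increase": {"FP": [6.818, 5.745, 8.473, 8.055], "FA": [7.873, 6.600, 6.345, 6.491]},
    "Control":  {"FP": [7.567, 7.567, 8.500, 8.667], "FA": [8.867, 7.533, 7.533, 7.433]},
    "Decrease": {"FP": [7.764, 6.691, 8.900, 9.436], "FA": [8.400, 7.436, 7.255, 7.618]},
}


def condition_means(daily: Dict[str, Dict[str, List[float]]]) -> Dict[str, Dict[str, float]]:
    """Average the daily means within each group and condition."""
    return {
        group: {cond: sum(values) / len(values) for cond, values in conds.items()}
        for group, conds in daily.items()
    }


means_by_condition = condition_means(DAILY_MEANS)
```

The Increase and Decrease averages and the Control FP average reproduce the Table 6 entries to three decimals; the Control FA days average to about 7.84.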

The reader will recall that certain deviant behavior codes were
"targeted" for change in the presentations of the expectancy rationales
to the Increase and Decrease groups. In the first presentation of the
rationales to the two groups, the experimenter strongly suggested that
yells (YE), whines (WH), and aversive behaviors (HR) would change. In
the second presentation to both groups, it was suggested that changes
would also be observed in teases (TE), disapprovals (DI), and aversive
commands (CN). Visual inspection of the data for these six deviant behaviors
presented in Figure 8 reveals no trends for crossed, sprayed, or
monotonic interactions between expectancy group and FP-FA conditions.

The data for targeted deviant behaviors are presented in Table 10.

Table 10 

Mean Rates of Six Targeted Deviant Behaviors per Five Minutes 
by Expectancy Group across FP and FA Conditions 



Group 



FP 



FA 



Increase
(N = 11) 

Control 
(N = 6) 

Decrease 
(N = 11) 



5.9^3 
6.155 



^.723 
if. 901 
5.2^)1 



Further analysis of the data requires that the possibility of sequence
effects resulting from the order in which the FP and FA tapes were
presented be ruled out. Consequently, the data from the original analysis
presented in Table 7 above were regrouped according to first and second
presentation rather than by FP and FA tape sets. Visual inspection of
this regrouping of the data presented in Figure 9 suggests no main effects
due to the order of presentation such as could be attributed to practice,
instrument decay, etc.

The data for the three expectancy groups across first and second presentations
of videotape sets can be found in Table 11.

Table 11

Mean Rate of Deviant Behavior per Five Minutes by Expectancy Group
Across First and Second Presentations of Tape Sets

Group                First set    Second set

Increase (N = 11)      6.940        7.0^10
Control (N = 6)        8.200        7.725
Decrease (N = 11)      7.545        8.295


We may now regroup observers regardless of order in which they saw 
the FP and FA tapes, permitting a test of the prediction that observer
bias may be a function of observer accuracy. All the observers within
each expectancy group were ranked according to their mean observer
accuracy obtained during the third week of the observer training
program. The lowest one-half of each of the expectancy groups was selected
out. Visual inspection of the low observer accuracy data presented in
Figure 10 suggests the presence of observer bias, especially in the
Increase group. An analysis of variance with repeated measures was run on
the data presented in Table 12 to test the low observer accuracy
hypothesis. The results are summarized in Table 13. The trends noted in
Figure 10 were not statistically reliable, failing to support the low
observer accuracy hypothesis.
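
The selection of low accuracy observers can be sketched as follows. The accuracy scores and observer labels are hypothetical. Rounding the half upward for odd-sized groups is an assumption, chosen because it gives subgroups of 6, 3, and 6 from groups of 11, 6, and 11.

```python
import math


def lowest_half(accuracy_by_observer):
    """Return the observers in the lower half of one group by accuracy.

    With an odd group size the middle observer is included (assumption).
    """
    ranked = sorted(accuracy_by_observer, key=accuracy_by_observer.get)
    return ranked[: math.ceil(len(ranked) / 2)]


# Hypothetical third-week accuracy scores for each expectancy group.
groups = {
    "Increase": {f"I{i}": 0.60 + 0.02 * i for i in range(11)},
    "Control": {f"C{i}": 0.62 + 0.03 * i for i in range(6)},
    "Decrease": {f"D{i}": 0.58 + 0.025 * i for i in range(11)},
}

low = {name: lowest_half(scores) for name, scores in groups.items()}
sizes = {name: len(members) for name, members in low.items()}
```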

Table 12

Mean Rates of Deviant Behavior per Five Minutes by Expectancy Group
Across FP and FA Conditions for Low Accuracy Observers

Group                FP           FA

Increase (N = 6)     7.[?]67      7.217
Control (N = 3)      7.600        6.883
Decrease (N = 6)     8.172        [illegible]

Power Analysis of a Related Design



Inferences regarding null hypotheses have been generally discouraged 
in the past. However, a number of statisticians (Bakan, 1966; Binder, 
1963; Grant, 1962; La Forge, 1967; Natrella, 1960; Nunnally, 1960;
Rozeboom, 1960) have suggested the use of confidence intervals and/or
power analysis when inferences about a null hypothesis are of interest. 
In view of the failure to reject the null hypothesis of no observer bias 
in the original analysis presented in Table 8 above, it was decided 

[Figure 10. Rates reported by low accuracy observers]

Table 13

Analysis of Variance with Repeated Measures for Mean Rates of
Deviant Behavior by Expectancy Group Across FP and FA
Conditions for Low Accuracy Observers

Source                                   SS      df      MS       F

Between subjects                    102.531      14
  Rows                                1.919       2    0.959    0.11
  Subjects within groups            100.612      12    8.384
Within subjects                      12.918      15
  Columns                             2.070       1    2.070    2.37
  Rows x columns                      0.379       2    0.190    0.22
  Columns x subjects within groups   10.469      12    0.872

Total                               115.449      29

to conduct a power analysis. Generation of a confidence interval around 
the null hypothesis of no observer bias would suggest what magnitude of 
observer bias would first be detected as significant by the present design. 
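
As a cross-check on Table 13, the F ratios can be recomputed from the sums of squares and degrees of freedom. The sketch assumes the usual error terms for this design: subjects within groups for the between-subjects effect (rows, the expectancy groups), and the residual within-subjects term for columns (FP vs. FA) and for the interaction. The variable names are illustrative only.

```python
# Sums of squares and degrees of freedom as read from Table 13.
ss_rows, df_rows = 1.919, 2          # expectancy groups
ss_subj, df_subj = 100.612, 12       # subjects within groups
ss_within, df_within = 12.918, 15    # all within-subjects variation
ss_cols, df_cols = 2.070, 1          # FP vs. FA
ss_inter, df_inter = 0.379, 2        # groups x conditions

# Residual within-subjects error, obtained by subtraction.
ss_err = ss_within - ss_cols - ss_inter
df_err = df_within - df_cols - df_inter

ms_subj = ss_subj / df_subj
ms_err = ss_err / df_err

f_rows = (ss_rows / df_rows) / ms_subj
f_cols = (ss_cols / df_cols) / ms_err
f_inter = (ss_inter / df_inter) / ms_err

print(round(f_rows, 2), round(f_cols, 2), round(f_inter, 2))
# prints: 0.11 2.37 0.22
```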

Unfortunately, too many parameters are unknown to permit a power
analysis with a repeated measures analysis of variance design. However,
a related design that achieves a crude test of statistical interaction
between the Increase and Decrease groups across the FP and FA conditions
is the t-test of differences between differences (Walker & Lev, 1953,
pp. 158, 166). Assuming no difference between Increase and Decrease
groups in the FP condition, Ns for various alternate differences in the
FA condition such as 0.8, 1.5, and 2.3 deviant behaviors per five minutes
could be computed. Such "sprayed interactions" would suggest mean
observer biases in the two groups of 5, 10, and 15%, respectively, when
compared to a mean of 7.7[?] deviant behaviors obtained in the FP condition.
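
The percentages can be reproduced with simple arithmetic under two assumptions that the text does not state outright: that the total Increase-Decrease difference in the FA condition is split evenly between the two groups, and that the FP mean is about 7.7 deviant behaviors per five minutes (its final digit is not legible here).

```python
fp_mean = 7.7                  # assumed FP mean, rounded to one decimal
differences = [0.8, 1.5, 2.3]  # Increase minus Decrease in the FA condition


def bias_per_group(diff, baseline):
    """Percent shift per group if each group moves by half the difference."""
    return 100 * (diff / 2) / baseline


percents = [round(bias_per_group(d, fp_mean)) for d in differences]
print(percents)  # prints: [5, 10, 15]
```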

Trial and error runs with s² = 1.687, α = .10, and β = .10 indicated
that a minimal bias of 3% for each of the expectancy groups would
be required before the t-test would detect the bias as significant with
an N of 11 in each group.

By extrapolation, it can be inferred that the repeated measures
analysis of variance design used in this study would detect minimal
biases of 5-10% with the relatively large N employed. In the judgement
of this investigator, the design used was relatively sensitive to
observer bias.






CHAPTER IV
DISCUSSION

Conclusions

The highly significant differences obtained between groups on the two 
measures of observer expectancy strongly support the hypothesis regarding 
manipulation of group expectations. Observers not only comprehended and 
accurately recalled the predictions given their group but differed signi- 
ficantly across groups in their personal expectations of experimental 
outcomes. 

Differences in sensitivity to deviant behavior between the two informed
(Increase and Decrease) and one uninformed (Control) groups were not
obtained as hypothesized. These results fail to replicate the findings of
Kass and O'Leary (1970) and Skindrud (1972) that observers informed of
the dependent variables of a study report higher frequencies of those
specific code categories throughout the study. This failure to replicate
can be explained on the basis of much less extensive training given the
Kass and O'Leary (1970) observers (only six training sessions vs. 12 in
the present study) and the absence of the possible selection confound in
the Skindrud (1972) study.

Repeated analyses of the data failed to produce any evidence that 
the successful expectancy manipulation biased the observations. Whether 
only observation sessions in temporal proximity to the presentation of 
the expectancy rationale, deviant behaviors targeted for change, or low
accuracy observers were considered, no evidence for observer bias was
obtained. While it is impossible to state that no biasing of the
observations occurred, it is possible with the extrapolation from the power
analysis of a related design reported earlier to rule out observer bias
of greater than 5-10% per group in the present study.

A Conceptual Model

The results of the present study, the review of published studies on
observer bias in Chapter I, and recent studies made available to this
investigator since data collection for the present study, suggest a
three-dimensional model of conditions contributing to observer bias in
experimental-field studies. To the extent that such a model is valid,
it could be used to predict the conditions under which observer bias is
likely to occur and allow alternatives for the control of this confounding
variable. The dimensions of the model can be seen in Figure 11.

Impact of the experimenter's expectancies upon observers. The first
dimension deals with the degree to which observers are influenced by the
experimenter's expectancies. In only a small minority of studies of
child behavior therapy (4% according to Pawlicki, 1970) are observers
kept totally uninformed of the experimenter's hypothesis, placing them
at the "weak" end of the expectancy impact continuum. It is more likely
that observers in most studies evaluating behavior therapy are well aware
of the intended outcomes and have received some feedback regarding
successful "early data returns," placing them in the middle column along
the continuum. The present study, which employed a moderately strong
manipulation of observer expectancies, would be located here. A strong
(but subtle) impact would occur where observers would additionally be
exposed to experimenter reactions during data collection which selectively
reinforce the reporting of confirming data. Displays of interest and
comments on the part of the therapist-experimenter (e.g., "Ummm...that
data confirms our hypothesis," or, "Ah ha! Deviant behavior is beginning
to drop as the family enters treatment.") are almost unavoidable where
experimenter-observer interaction is not prevented. Such an impact is
found in the extreme right column of the expectancy continuum.

Two small studies suggest that the location on the expectancy
impact dimension is critical and does not interact with the other
dimensions in the production of observer bias. Skindrud (1972), in a pilot
study for the present investigation, trained six observers and ran them
through the same design with only two procedural differences. The pilot
study observers were trained on videotapes of the Craig family, resulting
in higher levels of observer accuracy than in the present study and,
most critically, in addition to the expectancy manipulation, observers
were exposed to experimenter reactions selectively reinforcing confirming
data reports at eight points throughout data collection. (Deviant
behavior counts were public rather than private in the Increase and Decrease
groups and especially attended to when they supported the experimenter's
predictions.) A significant interaction was obtained in the predicted
direction, suggesting the presence of observer bias (F = 6.[?]5; df = 2, 3;
p < .10). O'Leary and Kent (1972) report a study with a multiple
baseline design employing four behaviorally specified code categories, high
levels of observer agreement, an expectancy manipulation, and differential
experimenter reactions to confirming and disconfirming reports by the
observers. The results indicated a significant effect consistent for the
two categories of behavior subject to the expectancy manipulation
(Orienting and Vocalization) and absent for the two unmanipulated
categories (Play and Noise). These independent investigations (O'Leary &
Kent, 1972; Skindrud, 1972) reveal the powerful biasing effect of
experimenter reactions during data collection in spite of high levels of
observer accuracy and agreement.

Rosenthal (1966, Chapter 13) reports diminishing or negative bias
where attempts to reward reports of confirming data are excessive and
obvious. Studies by Rosenthal and his colleagues found that graduate
student experimenter-observers offered $5.00 per hour for doing a "good
job" to obtain confirming data from their subjects produced less
confirming data than those offered a standard rate of $2.00 per hour.
Furthermore, there was a tendency for the highly rewarded
experimenter-observers' data to correlate negatively with the expectancy
manipulation. Post-experimental group discussion with the graduate student
experimenters offered excessive rewards revealed that they were upset by the
apparent attempt to bribe them to produce confirming data and were
bending over backwards to be uninfluenced.

In the case of the present study, the fact that post-test question- 
naire results did not produce responses indicating awareness of attempts 
to manipulate results and that pilot study findings demonstrated observer 
bias when experimenter reactions were added to the expectancy manipulation 
suggest that the expectancy manipulation in the present study was neither
obvious nor excessive. 

Observer accuracy. A second dimension of the conceptual model is the
accuracy with which observers encode behaviors relative to standard defi- 
nitions of the code categories used. Many factors affect observer accu- 
racy. Major determinants of observer accuracy are the methods of assessing 
observer reliability. These methods can be roughly ordered on a continuum 
from those associated with high to low observer accuracy as follows: 

(1) Observer accuracy is maximized by comparing observer reports to 
standard criterion codings of videotapes of the same interaction coded 
and recoded by a pair of well-trained observers anchored to standard code 
definitions until 100% agreement is obtained. Such a measure of observer
accuracy is classified in the top row of the observer accuracy dimension. 

(2) Experimental-field studies using the practice of assessing inter- 
observer agreement by pairing observers and comparing reports will be 
classified within row two. Since high levels of observer agreement 
within groups do not guarantee high levels of agreement between groups 
due to observer drift from standard code definitions over time (O'Leary
& Kent, 1972; Romanczyk et al., 1972), this method is judged less rigorous
than the preceding one.

(3) A common practice has been to train observers to acceptable levels 
of observer reliability and then send them on data collection assignments 
unmonitored for observer accuracy or agreement. As reported earlier, 
such a method produces immediate and dramatic drops in levels of
agreement when overt monitoring terminates (Reid, 1970; Romanczyk et al., 1972).
Consequently, this method is classified in the third row from the top. 

(4) A fourth method completely without rigor would be the use of
observers who were haphazardly trained without observer accuracy or
agreement checks and unmonitored for observer reliability throughout data
collection. Studies employing such a method would be found at the bottom 
of the continuum. 
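
The difference between observer accuracy (comparison with a criterion coding) and observer agreement (comparison with another observer) can be illustrated with a minimal sketch. Simple interval-by-interval percent matching is an assumption here, not necessarily the scoring rule used in the studies cited. The code abbreviations and sequences are hypothetical.

```python
def percent_match(codes_a, codes_b):
    """Percent of positions on which two equal-length code sequences match."""
    if len(codes_a) != len(codes_b):
        raise ValueError("sequences must cover the same intervals")
    if not codes_a:
        raise ValueError("sequences must not be empty")
    hits = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * hits / len(codes_a)


# Hypothetical codings of the same eight intervals.
criterion = ["TA", "YE", "TA", "CM", "WH", "TA", "HI", "TA"]
observer_1 = ["TA", "TA", "TA", "CM", "TA", "TA", "HI", "TA"]
observer_2 = ["TA", "TA", "TA", "CM", "TA", "TA", "HI", "TA"]

accuracy_1 = percent_match(observer_1, criterion)   # against the standard
agreement = percent_match(observer_1, observer_2)   # against each other
```

In this constructed case the paired observers agree perfectly (100%) yet match the criterion on only 75% of intervals. Agreement within a pair therefore cannot by itself reveal a shared drift from the standard definitions.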

The present study, together with a series of three studies conducted
at the State University of New York at Stony Brook by O'Leary and his
colleagues, sheds some light on the relationship between observer accuracy
and bias. The first of the Stony Brook studies (Kass & O'Leary, 1970),
which has already been reviewed, did not involve overt monitoring of
observer agreement during data collection and did produce significant
observer bias. Since the drop in observer agreement upon termination of overt
monitoring is a well-replicated finding, it can be presumed that observer
accuracy and agreement were low during data collection for the Kass and
O'Leary (1970) study. The second study in the Stony Brook series was a
dissertation by Kent with a total of [?]0 observers (O'Leary & Kent, 1972).
Observers were broken into two groups of five within each expectation
condition during the final three days of training and for the duration
of the study. Throughout, observers computed interobserver agreement
across rotating, randomly-formed pairs within each group. Subsequent
analyses of data from these observers (O'Leary & Kent, 1972) revealed
a drifting apart on the definitions of seven of the nine code categories
for the groups training and working independently. A third investigation
by the O'Leary group with a total of 20 observers (O'Leary & Kent,
1971) attempted to control for observer drift by dividing observers into
pairs and assigning five pairs to each expectancy condition by chance.
Assuming that observer drift is a random phenomenon (O'Leary & Kent, 1972),
it was felt that observer drift would be statistically controlled and
unconfounded with the effects of the expectancy manipulation. According to
Kent (1972), the behavioral ratings of the observers in both the second
and third studies in the Stony Brook series were totally unbiased by the
expectancy manipulations.

Of the studies reviewed thus far, the present study would be
classified in the top row along the observer accuracy continuum, the second
and third studies in the Stony Brook group in the second row, the first
of the Stony Brook group (Kass & O'Leary, 1970) and the Scott, Burton,
and Yarrow (1967) study in the third row, and the Rapp (1965) and Azrin
et al. (1961) studies along the bottom. As can be seen, observer bias
first appears along the continuum at the point where no method of overtly
monitoring observer agreement is used during data collection (row three
from the top) in the Kass and O'Leary (1970) and Scott, Burton, and
Yarrow (1967) studies. One can conclude that for the type of coding
systems and expectancy manipulations used, overt monitoring of observer
agreement during data collection may be a critical variable in the
prevention of observer bias. One study (Skindrud, 1972) which attempted an
indirect comparison of observer bias under overt and covert monitoring
of observer agreement did not find bias under either condition. However,
the study was limited by unequal error variances in the two observation
conditions and relatively small Ns. Replication of the Skindrud (1972)
study on a large scale with error variance in the covert monitoring
condition controlled would be a strong test of the validity of the model
presented here.




Observation code. The third dimension of the conceptual model deals
with the degree to which code definitions in observational studies are
specific and behaviorally defined vs. global or ambiguous. Global
judgements are to be distinguished from specific code definitions in that
they may contain a great number or variety of specific behavioral events
and require some degree of inference not based on direct observation.
Global categories such as "aggressive," "dependent," and "immature"
would be common examples.

Classification of codes as specific and behavioral is more difficult 
as it is possible to have a relatively specific code such as Yell or 
Whine without clear behavioral definition. The problem with the obser- 
vation of such code categories is they fall on continua with other 
categories such as Talk. How loud does a Yell or how nasal does a Whine
have to be to qualify as a deviant behavior rather than Talk? Such 
categories are more difficult to define behaviorally (without the aid 
of instrumentation such as a decibel meter) than categories such as Hit 
or Command which appear more discrete. 

If code categories are either global or ambiguous they are classified 
within the row to the rear of the model. If they are both specific and 
behaviorally defined they are classified in the row to the front of the 
code definition dimension. 

Studies employing global judgement clearly are influenced by obser- 
ver expectancies. O'Leary and Kent (1972) asked observers to report 
their "perceptions" of change at the conclusion of the second study in 
the Stony Brook series. Within each of the two expectancy groups, half 
of the observers were shown videotapes that contained the predicted
change and half saw tapes with no actual change in rates of disruptive
behavior. Surprisingly, the observers' ratings of their global percep- 
tions of change were significantly associated with the expectancy ra- 
tionales presented their group but not with the actual changes occurring 
on the tapes coded. It will be recalled that when specific code cate- 
gories were used these same observers collected data with high levels of 
agreement and no evidence of observer bias. 

Observers in the studies by Rapp (1965), cited by Rosenthal (1966),
and by Azrin et al. (1961) employing global categories such as "above
and below par" and "expression of personal opinions," respectively, were 
very susceptible to observer bias. 

No studies are available which used exclusively ambiguous-specific
code categories. The present study, the three Stony Brook studies, and
the Scott, Burton, and Yarrow (1967) study all used relatively
behaviorally defined code categories. Had the Stony Brook studies used less
behaviorally defined code categories, this investigator would have
predicted the occurrence of significant observer bias for all three studies.
With less behaviorally defined code categories, monitoring observer
agreement during data collection may not have been sufficient to prevent
observer drift in the direction of the expectancy manipulation in the
second and third studies of the Stony Brook series. O'Leary and Kent
(1972) report some observer drift in all three of the studies. However,
in the second and third studies apparently the drift from standard code
definitions was not great or was uninfluenced by observer expectancies
due to the overt monitoring. Dropping out overt monitoring in the first
(Kass & O'Leary, 1970) study opened the flood gates to observer drift
and significant bias resulted.

Limitations of the Present Study 

The model above suggests the major limitation of the present study.
Most experimental-field studies fall into rows two and three along
the observer accuracy dimension. They attempt to maintain quality data
by the pairing of observers and checking observer agreement throughout
data collection or omit observer agreement checks altogether at the
conclusion of observer training. The present study took precautions to
prevent observer drift through the use of random observer accuracy checks
against standard criterion codings of observed interaction throughout
data collection. Consequently, generalization of the results to the
bulk of experimental-field studies employing naturalistic methods of
data collection is questionable.

It does not appear that the external validity was curtailed
significantly by the use of observations of videotaped rather than in vivo
interaction. Observations were of relatively unstructured family
interaction recorded around the dinner hour in the home of the Craig family.
Also, O'Leary and Kent (1972) report comparisons of three observational
media (in vivo, behind a one-way mirror, and via closed-circuit TV)
where awareness of observation was held constant, controlling subject
reactivity. No significant differences occurred across media for an N
of three observers per group except for one of the nine code categories,
Vocalization. The latter was attributed to inadequate audio pick-up in
the sound system used in the one-way mirror and closed-circuit TV
conditions. Frequencies of Vocalization were lower for those media.
In the present study, subject reactivity to the videotape cameras on the
part of the Craig family was controlled by the matching of FP and FA
tapes on rates of deviant behavior, mentioned earlier.

Recommendations for the Control of Observer Bias

(1) Uninformed observers without culturally determined expectations
regarding experimental outcome appear to be the best control against
observer bias. If well trained and monitored for observer accuracy,
observer sensitivity to the behaviors of interest should be unaffected.
Where it is impractical to keep an entire observer staff uninformed, it
may be possible to periodically add to the staff one uninformed observer
freshly trained and anchored to standard code definitions. Such a
"calibrating observer" can be used as a standard against which observer
agreement and objectivity can be assessed while providing an incentive
for the informed observers to avoid bias (Skindrud, 1972).

(2) If the problem being investigated is obvious and expected
outcomes known, as in evaluations of child behavior therapy, placebo control
groups can be used (Campbell & Stanley, 1966; Walter & Gilmore, 1972).
Observers can be kept uninformed regarding group membership, controlling
expectancy effects.

(3) Where convincing placebo groups are impractical, observers can 
be assigned to each subject on a rotating basis so that no observer 
collects data on the same family or classroom more than once. It then 
becomes difficult for observers to infer the treatment status within 
subjects. Potentially confounding expectancies can be assessed by 
having observers guess the treatment status following each observation
as Johnson and Bolstad (1972) have done. If observer "guesstimates" are
unrelated to actual treatment status, the investigator may conclude his
design is unconfounded by observer bias. 
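
One way to check whether guesses are related to actual treatment status is a 2 x 2 association measure. The sketch below uses the phi coefficient. That choice is an illustrative assumption, not the procedure reported by Johnson and Bolstad (1972), and the guesses shown are hypothetical.

```python
def phi_coefficient(guesses, actual, positive="treatment"):
    """Phi (2 x 2 correlation) between guessed and actual status."""
    a = b = c = d = 0
    for g, t in zip(guesses, actual):
        if g == positive and t == positive:
            a += 1   # guessed treatment, was treatment
        elif g == positive:
            b += 1   # guessed treatment, was baseline
        elif t == positive:
            c += 1   # guessed baseline, was treatment
        else:
            d += 1   # guessed baseline, was baseline
    denom = ((a + b) * (c + d) * (a + c) * (b + d)) ** 0.5
    return (a * d - b * c) / denom if denom else 0.0


# Hypothetical record of eight observation sessions.
actual = ["baseline", "treatment"] * 4
guesses = ["baseline", "treatment", "treatment", "baseline",
           "baseline", "treatment", "treatment", "baseline"]

phi = phi_coefficient(guesses, actual)
```

A value near zero, as in this constructed example, is what unrelated guesses would look like. Values approaching 1 would indicate that observers can infer treatment status.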

(4) Where impossible to keep observers uninformed of treatment status,
videotape recordings during baseline and treatment conditions may be
necessary. Observers can be kept totally uninformed of treatment condi- 
tions by removal of all identifying data and randomizing the order in 
which the tapes are observed and coded. 
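
A minimal sketch of this blinding step follows. The tape labels, the neutral code format, and the function name are hypothetical. The point is only that coders receive shuffled, neutrally labeled tapes while the key linking codes to conditions stays with the investigator.

```python
import random


def blind_tapes(tapes, seed=None):
    """Shuffle tapes and replace identifying labels with neutral codes.

    Returns (coding_order, key). The key maps each neutral code back to
    the original tape and is withheld from the observers.
    """
    rng = random.Random(seed)
    shuffled = list(tapes)
    rng.shuffle(shuffled)
    key = {f"T{i:03d}": tape for i, tape in enumerate(shuffled, start=1)}
    return list(key), key


# Hypothetical tapes: (family, condition, session number).
tapes = [
    ("family_1", "baseline", 1),
    ("family_1", "baseline", 2),
    ("family_1", "treatment", 1),
    ("family_1", "treatment", 2),
]

coding_order, key = blind_tapes(tapes, seed=1972)
```

Observers would be handed only `coding_order`. Conditions are re-attached from `key` after all coding is complete.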

(5) Where the above alternatives are impractical, it is recommended
that every effort be made to use specific, behaviorally defined codes
with well trained observers anchored to standard code definitions
throughout data collection. O'Leary and Kent (1972) and Johnson and Bolstad
(1972) provide a number of practical suggestions for the control of
observer drift and the maintenance of observer accuracy. Presumably,
observers who are "locked in" to well defined code categories will be
less susceptible to experimenter expectancy effects unless they
intentionally distort the data as in the Azrin et al. (1961) study. The
latter appears to be a major problem only where undergraduate observers
are used (Rosenthal, 1966, Chapter 3).
FOOTNOTES

1. The work for the present investigation and reported herein was 
performed pursuant to Contract OEC-X-72-0001 (057) with the United States 
Department of Health, Education and Welfare, Office of Education, 
Karlton Skindrud, project director. The pilot work and project director 
were supported by United States Public Health Service (National Institute 
of Mental Health, Center for Studies of Crime and Delinquency) Grant
MH 15985-02 to Gerald R. Patterson.

2. The work sample of the observation task consisted of two short
videotape segments of the alpha-numeric code presented at different rates.
Applicants were asked to copy the alpha-numeric code (e.g., "1 PN - A CR")
from the TV monitor as accurately as they could onto a protocol sheet.
They heard a tone every 30 seconds cuing them to move down one line on
their sheet.

In the first work sample, the alpha-numeric code was presented at the 
rate normally coded in the field. The second work sample presented the 
interactions at twice normal rate. The two videotape segments lasted 
five and 2 1/2 minutes, respectively. The first was presented to orient 
the naive applicant to the observer task--most could manage it without 
difficulty. The second was presented rapidly to screen out those who 
might encounter difficulty coordinating observing and rapid handwriting.
Mean observer accuracy across the applicants during the second work
sample was .73 with a range from .44 to .95.
3. A separate study was designed to develop a selection battery
which would reliably predict observer accuracy. A multiple regression
analysis of the four predictor variables employed in the present study
resulted in three significant predictors of observer accuracy at the end
of a three-week training program--the Employee Aptitude Survey Numerical
Reasoning subtest (+.[?]2), the work sample of the observation task (+.40),
and the Bendig short form of the Taylor Manifest Anxiety Scale (+.30).
The multiple correlation of these three predictors with the criterion
was +.61. Analysis of the residuals suggests that these measures are
best at selecting out applicants who find such a complex observation task
difficult, particularly where the applicant is handicapped by low
handwriting speed. Such a significant multiple correlation is surprising in
view of the truncated distribution of applicants. Selection had
eliminated the lower one-third of the original distribution.

4. Two observer training methods were compared in a pilot study prior
to the present investigation--a "trial by fire" method where observers
were expected to code complex family interaction throughout, and a
"shaping" method where observers began by coding the behavior of simply
one family member and, when proficient, moved on to coding simple and
then complex interaction. Evaluation of both training methods on a
common criterion suggested no differences at the end of the three-week
training program. However, the "shaping" method resulted in fewer
complaints from the trainees and faster progress for those trainees with
less aptitude for the observation task. Consequently, the "shaping"
method was selected for training observers in the present study.
5. Johnson and Bolstad (1972) were the first to make this distinction
between observer accuracy and agreement clear. 

6. A significant relationship between the simplicity of social
interaction and observer accuracy was noted in both the pilot and present
studies. Simplicity of interaction was measured by the percent of social
interactions repeated consecutively within each five-minute segment of
videotape coded. The correlation between simplicity and observer accuracy
across 12 test tapes of the Craig family, with a mean of 26% and a range
from 11 to 36% repeated interactions, in the pilot study was +.53. The
same correlation across 11 test tapes of the Ross family in the present
study was +.65.
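
The simplicity measure can be sketched as below. The denominator (all coded interactions in the segment) and the example sequence are assumptions, since the footnote does not spell out the exact computation.

```python
def simplicity(interactions):
    """Percent of coded interactions identical to the one just before."""
    if len(interactions) < 2:
        return 0.0
    repeats = sum(
        cur == prev for prev, cur in zip(interactions, interactions[1:])
    )
    return 100 * repeats / len(interactions)


# Hypothetical sequence of coded interactions in one five-minute segment.
segment = ["TA", "TA", "YE", "TA", "TA", "TA", "CM", "HI", "HI", "TA"]

score = simplicity(segment)  # 4 consecutive repeats out of 10 interactions
```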

This relationship explains the different standards for observer
accuracy in observations of videotape recordings and observer agreement
in field operations: 70% and 85%, respectively, with the Patterson et
al. (1969) and related family interaction codes. The difference between
the standards for field and videotape observations is attributed to the
greater complexity obtained in criterion codings of videotapes. Field
observations do not permit the coding of behavior in as much detail as
do repeated re-codings of videotape. This difference in simplicity is
reflected in the mean percent of consecutively repeated interactions
reported for criterion codings of videotapes (26%) noted in the
representative sampling of videotape protocols above and (4[?]%) from a
representative sampling of field observations by paired observers from the staff
of the Social Learning Project. A representative sampling of field
observations by solo observers not monitored for observer agreement
resulted in even higher simplicity ratings (4[?]%).
The conclusion was that a criterion of 70% observer accuracy during
coding of videotaped family interaction was equivalent to a criterion of
85% observer agreement in the field. This assumption is supported by a
subsequent study with the same observers coding relatively simple family
interaction (50% repeated interactions) in the field. Under field
conditions they obtained 83% agreement (White, 1972).

7. In the pilot study coding of the FP and FA tapes the mean rate 
of deviant behaviors for all family members on the FA tapes increased 
60% over the rate on the FP tapes. FP and FA days were alternated 
during the five days of videotaping in the home of the Craig family to 
control for subject reactivity. These results are in essential agree- 
ment with some preliminary field observations under FP and FA conditions 
which suggested that the father's absence influences child management in
"normal" but not "deviant" families.

8. The mean rate for the 12 deviant behavior codes reported by the
28 trainees during their coding of the 50 minutes of the Craig family
baseline tapes was 6.1 with a range from 3.2 to 9.1 deviant behaviors
per five minutes across the 28 trainees. It was to control these
rather large individual differences in observers that the matching
procedure with random assignment to groups was used.

9. The possibility existed that there are individual differences
in response to an expectancy manipulation (Rosenthal, 1966, Chapter 13,
p. 218). Consequently, those 13 observers judged in agreement with the
experimenter's predictions were hypothesized to be the most susceptible
to observer bias. Visual inspection of the rates of deviant behavior
reported by these 13 observers (N = 4 Increase, 4 Control, and 5 Decrease
observers) revealed a minimal trend for observer bias among the Decrease
observers. However, an analysis of variance with repeated measures
showed no significant interactions, failing to support the hypothesis
of bias for these observers.

10. Rates of deviant behavior on the baseline tapes were not matched
with rates on the FP and FA tapes, explaining the "main effect" that
would result in a 3 x 3 ANOVA for treatment conditions.
APPENDIX A

Experimental revision for 
FP-FA Study 
October 1971 

MANUAL FOR CODING OF FAMILY INTERACTIONS 
Adapted from the Manual by 
G. R. Patterson, R. S. Ray, D. A. Shaw, and J. A. Cobb
Oregon Research Institute and University of Oregon 

The behavioral coding system described in this manual is designed 
to provide an accurate running account of social interaction among 
family members. The behavioral codes are intended to cover all events 
that occur in a household. By weeding out old categories and adding 
new ones, it is now feasible to code each behavior that occurs in a 
home under one of the 29 categories presented in this manual. Three 
main considerations have determined the current status of the coding 
system: very few behavioral categories should be used in order to 
develop a flexible code that would be relatively easy to learn; the 
behavioral categories should be distinct from one another; and the 
behavioral categories should require very little inference on the part of 
the observer, i.e., that the behavior be observable. The success of the 
system will be shown in the ease with which new observers learn the 
code and the ease with which the code applies to new families as they 
join the project. 

The manual is divided into four main sections. The first will state 
some of the general rules regarding observation procedures, the second 
will provide a definition for each behavioral category, the third will 
provide more specific rules and use of specialized symbols that are 
applicable during an observation, and the fourth section will describe 
verbally a typical family situation and give a complete coded sheet that 
accurately records the behaviors of the family members in the situation. 

General Rules 

At any given time one family member is designated as the subject of 
the observer's attention. During that period of time the subject's 
behavior is coded alternately with that of other family members with whom 
the subject interacts. The simplest way to learn to code alternately is to 
observe the subject, code his behavior, look at the reactions to that 
behavior from other family members, code those behaviors, then look at the 
subject and begin the process again. Often the family member other than the 
subject may be coded first as the behavior of the family member precedes 
the subject's behavior, not only in time, but also "naturally." For 
example, if the son is the subject and the father tells him to empty the 
garbage, the command of the father would precede whatever behavior the 
son did in relationship to the command. Therefore, there is no hard rule 
that the subject behavior must precede the other family member's behavior. 

A sequence of behavior is arbitrarily defined as one interaction 
between the subject and one or more family members. A sequence is made 
up of two parts, one involving the subject, and the other involving the 
other family member or members. A part consists of one or more units. 
A unit is defined as the identifying number for the person plus the 
behavioral code or codes. To illustrate the sequence and its component 
parts, note the diagram. 

                        SEQUENCE 
                       /        \ 
                 PART I          PART II 
                   |            /       \ 
             SUBJECT UNIT   OTHER FAMILY   OTHER FAMILY 
                            MEMBER UNIT    MEMBER UNIT 

As stated previously, the subject unit need not precede the other family 
member unit in a sequence. The symbol Z is used when more than one family 
member is involved in a part of a sequence. Family members are given 
numbers from 1-8; 1 is reserved for the deviant child; 2 is for the father; 
3 is for the mother; 4 is for the oldest child; 5 is for the next oldest 
child, etc. The number 9 is reserved for those cases in which three or 
more family members are doing a similar behavior; in that case, instead 
of listing each family member, the number 9 is used. The number 0 is 
reserved for the designation of the observer; in some cases a subject will 
direct behavior to the observer. The subject's behavior must be coded as 
well as the behavior of the observer, so the 0 is used to differentiate 
the observer from the family members. The codes consist of two capitalized 
letters that are mnemonic devices for the behaviors which are to be 
coded. An example of a subject and his behavior, which is one unit as 
well as one part of a sequence, is 1LA, which states that the deviant 
child laughed. A sequence which involves two parts, i.e., the subject's 
behavior and the other family member's behavior, could be 1LA 2AP. 
Translated, this means the deviant child laughed and the father approved of the 
laughter. Sometimes it is necessary to code more than one behavior to 
describe a unit; this procedure is acceptable up to two behaviors. Thus, 
1LA PP means that the deviant child laughed and physically touched another 
person in a positive manner, e.g., a hug or a kiss. Likewise, it is 
permissible to code two behaviors for other family members: 1LA PP 2AP 
PP indicates that after the deviant child had laughed and given some 
positive physical contact, the father had approved and also had given 
some positive physical contact. Thus, for each person coded, it is 
necessary to identify him or her by number and give one behavior that he 
or she was doing; two is the maximum number of behaviors that can be 
attributed to each person in a sequence. Also two is the maximum number 
of persons that can be individually identified within a part of a sequence. 
1LA PP 2AP PP Z 3LA PP means that not only did father approve and 
provide positive physical contact to the son, but mother laughed and approved 
as well. The example given is the longest sequence that is to be used 
with the behavioral code. If more than two persons are involved in part 
of a sequence, use the number 9 code if they are doing the same kind of 
behavior. In the example cited, if the mother and father as well as one 
of the other children all smiled approvingly after the child laughed, 
then the sequence is 1LA 9AP. When more than two persons are involved 
in a part of a sequence and they are exhibiting different behaviors, the 
observer must choose the most relevant behavior to code. This will be 
explained in a later section. One additional rule in coding a sequence 
is that the subject is always coded separately, i.e., he is never coded 
with another individual as other family members are. For example, 
when 1 is the subject, 1LA 2AP Z 3AP is permissible, but 1LA Z 2AP 
3AP is incorrect. 
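
The sequence rules above can be checked mechanically. The following Python sketch is illustrative only and is not part of the original manual. The names parse_sequence and check_sequence are hypothetical, and the sketch assumes that a unit is written as a person digit followed by a two-letter code, that a second behavior is written as a bare two-letter code, and that Z stands between the two other-family-member units it joins.

```python
import re

UNIT = re.compile(r"^(\d)([A-Z]{2})$")   # person digit plus first behavior code
CODE = re.compile(r"^[A-Z]{2}$")         # bare second behavior code


def parse_sequence(text):
    """Return (units, joins) for one coded sequence.

    units is a list of (person, [codes]); joins lists the index pairs of
    units linked by the symbol Z.
    """
    units, joins, pending = [], [], False
    for token in text.split():
        if token == "Z":
            if not units or pending:
                raise ValueError("Z must stand between two units")
            pending = True
            continue
        unit = UNIT.match(token)
        if unit:
            units.append((int(unit.group(1)), [unit.group(2)]))
            if pending:
                joins.append((len(units) - 2, len(units) - 1))
                pending = False
        elif CODE.match(token) and units and not pending:
            units[-1][1].append(token)       # second behavior for the same person
        else:
            raise ValueError("unrecognized token: " + repr(token))
    if pending:
        raise ValueError("Z must stand between two units")
    return units, joins


def check_sequence(text, subject):
    """List the rule violations found in a sequence (empty list if none)."""
    units, joins = parse_sequence(text)
    problems = []
    for person, codes in units:
        if len(codes) > 2:
            problems.append("more than two behaviors for person %d" % person)
    others = [person for person, _ in units if person != subject]
    if len(others) > 2:
        problems.append("more than two persons identified in one part")
    for left, right in joins:
        if subject in (units[left][0], units[right][0]):
            problems.append("subject joined to another person with Z")
    return problems
```

Under these assumptions, check_sequence("1LA 2AP Z 3AP", 1) reports no problems, while check_sequence("1LA Z 2AP 3AP", 1) flags the subject being joined with Z.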
From a sequence the observer will build up longer streams of behaviors. 
Usually for each member of the family, 10 minutes of data will be collected 
at each observation. Two non-consecutive five-minute periods of 
observation will be collected for each family member. The five-minute segments 
are divided into 30-second intervals, each 30 seconds representing one 
line of data on the Behavior Coding Sheet. In each 30-second interval 
the observer is expected to average five sequences. The number of 
sequences will be determined by the behaviors of the family as well as the 
speed of the observer. 

Behavioral Codes 

This section is divided into two main sections, First Order Behaviors 
and Second Order Behaviors. The reason for the division into two sections 
is for the observer to have a knowledge of priorities in coding behaviors. 
It is impossible to code every behavior emitted, and many times a person 
will emit three or four of the behaviors listed in the manual. In order 
to resolve the problem and keep the number of behaviors attributable to 
one individual down to two per sequence, some behaviors are designated 
Second Order Behaviors, which means that they are never coded when a 
First Order Behavior can be coded. It is up to the discretion of the 
observer what behaviors to choose among several behaviors within the 
same order. Since the observer can code only two, she must pick those 
behaviors that best describe the social interaction that is occurring. 
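
The priority rule can be expressed as a small filter. The Python sketch below is illustrative only and not part of the original manual. The name select_codes is hypothetical. It assumes the Second Order codes are TA, AT, and NR, the only ones defined under the Second Order headings of this revision, and it stands in for the observer's discretion by simply keeping the first two candidates in the order they were noted.

```python
# Assumption: the Second Order codes defined in this revision of the manual.
SECOND_ORDER = {"TA", "AT", "NR"}


def select_codes(observed, limit=2):
    """Pick at most `limit` codes for one person in one sequence.

    Second Order codes are dropped whenever any First Order code was
    observed; otherwise the Second Order codes are kept.
    """
    first_order = [code for code in observed if code not in SECOND_ORDER]
    candidates = first_order if first_order else list(observed)
    # Stand-in for observer judgment: keep the earliest candidates.
    return candidates[:limit]
```

For instance, a person who talks, laughs, and hugs in one sequence would be coded LA PP, with TA dropped.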

Not only have behaviors been divided on a priority basis, but also 
on whether they are verbal, non-verbal, or a combination. This is to 
aid the observer in cataloguing the codes, and perhaps learning them 
with greater ease. Behaviors are listed alphabetically within each 
subarea. 

First Order Verbal Behaviors 

CM (command): This category is used when a direct, reasonable, and 
clearly-stated request or command is made to another person. The 
statement must be sufficiently specific as to indicate clearly the behavior 
which is expected from the person to whom the command is directed. The 
command need not require immediate compliance, e.g., father tells the 
son that he has to mow the lawn on Saturday. However, the observer is 
always to indicate whether the command is complied with. In the example 
cited, the son could indicate verbally that he is or is not going to 
comply with the father's request. In those instances where the compliance 
will not follow directly, but is likely to occur before the observer is 
finished coding on the subject's observation sheet, the immediate 
response should be coded and when compliance or non-compliance occurs, 
that should be coded. For example, mother tells the child, who is the 
subject, to wash his hands before coming to dinner. The child tells his 
mother that he will and continues whatever he was doing, but in a minute 
he goes to the sink and washes his hands. The response to the mother's 
command would be the child's talking and compliance would be coded when 
he began washing his hands. Note that many questions are most 
appropriately coded as talk (TA) rather than CM. For example, "What's for 
dinner?" or "What time is it?" would be coded TA, while "Would you go into the 
living room and tell your father that dinner is ready?" or "Will you help 
me lift this table?" would be coded as CM. 

CN (COMMAND NEGATIVE): This is a command which is very different in 
"attitude" from the reasonable command or request described above. This 
kind of command has some of the following characteristics: (1) Immediate 
compliance is demanded. (2) Aversive consequences are implicitly or 
actually threatened if compliance is not immediate. (3) It is a kind of 
sarcasm or humiliation directed to the receiver. An example of the 
implicit use of aversive consequences is indicated by the tone of voice as 
well as the statement: Mother tells Johnny to shut the door in a normal 
tone of voice; he does not comply; she then raises her voice and says, 
"You'd better shut that door, young man." He shuts the door. The 
sequence would be coded 3CM 1NC 3CN 1CO. 

CR (cry): Use this category whenever a person cries. There are no 
exceptions. 

HU (humiliate): This category should be used when the agent makes 
fun of, shames, or embarrasses the subject intentionally. Examples: 
laughing in a derisive manner at the subject when he attempts to tie his 
shoe; telling the subject in a firm tone of voice, "Boy, you are really 
stupid"; when the subject is playing a game and someone says quite 
strongly, "You are a cheater." The observer must be careful to 
differentiate between playful verbal statements or nicknames and humiliate, e.g., 
some people call each other "stupid" but more in terms of endearment than 
of humiliation. The tone of voice, as well as the language used, should 
be considered by the observer before a decision is made to code HU or 
some other appropriate code. 

LA (laugh): Used whenever a person laughs in a non-humiliating way. 
For example, a person tells a joke and the other people laugh at the 
joke. However, if one of the people who heard the joke laughed in a 
derogatory manner at the person for the way he told the joke, that would be 
coded as HU and not LA. 

WH (WHINE): Use this category when a person states something in a 
slurring, nasal, high-pitched, falsetto voice. The content of the state- 
ment can be of an approving, disapproving, or neutral quality; the main 
element is the voice quality. 

YE (yell): This category is to be used whenever the person shouts, 
yells, or talks loudly. The sound must be intense enough that if carried 
on for a sufficient time, it would be extremely unpleasant. 

Non-Verbal Behaviors of the First Order 

DS (DESTRUCTIVENESS): Use of this category is applicable to those 
behaviors by which the person destroys, damages, or attempts to damage 
any object; attacks on people are covered by PN. The damage need not 
actually occur, but the potential for damage must exist, e.g., the 
child starts to throw a glass, but is stopped by the father. The value 
of the object is of no consideration nor is the actual damage done. 

HR (HIGH RATE): This category is applicable to any behavior not 
covered by other categories that if carried on for a long period of time 
would be aversive, e.g., running back and forth in the living room, 
jumping up and down on the floor, "rough housing." If the behaviors can 
be covered by other categories, e.g., YE, PN, DS, then HR is not to be 
used. It may happen that in a sequence of behaviors HR will be coded 
intermittently with more specific behaviors, e.g., the children are 
playing leap-frog in the house and at times one of them gives out with 
a scream; the coding would be the following: 1HR 4HR 1YE 4HR 1HR 
4HR 1YE 4YE 1HR 4HR, etc. 

PN (PHYSICAL NEGATIVE): Used whenever a subject physically attacks 
or attempts to attack another person. The attack must be of sufficient 
intensity to potentially inflict pain, e.g., biting, kicking, slapping, 
hitting, spanking, and taking an object roughly from another person. The 
circumstances surrounding the act need not concern the observer, only 
the potential of inflicting pain. For example, children may be playing 
and part of the play involves wrestling. If during the wrestling one 
child hits the other child or pins him down to the point where pain 
could result, then the act of hitting or pinning down should be coded PN. 

PP (PHYSICAL POSITIVE): Use this category whenever a person touches 
another person in a friendly or affectionate manner, e.g., hug, pat, 
kiss, arm around shoulders, holding hands, ruffling hair, etc. 

First Order Behaviors that may be Verbal or Non-Verbal 

AP (APPROVAL): Used whenever a person gives a clear gestural or 
verbal approval to another person. Approval is more than attention, in 
that approval must include some clear indication of positive interest or 
involvement. Examples of approval are smiles, head nods, phrases such 
as, "That's a good boy," "Thank you," and "That's right." 

CO (COMPLIANCE): Use this category when a person does what is asked 
of him in a CM, CN, or DP. Remember, compliance need not follow the 
previously mentioned behaviors immediately, as other behavioral sequences 
can intervene: 3CM 1CR 3PP 1CO. 

DI (disapproval): Use this category whenever the person gives verbal 
or gestural disapproval of another person's behavior or characteristics. 
Shaking the head or finger are examples of gestural disapproval. "I do 
not like that dress," "You didn't pick up your clothes again this morning," 
"You're eating too fast," are examples of verbal disapproval. In verbal 
statements it is essential that the content of the statement explicitly 
states disapproval of the subject's behaviors or attributes, e.g., looks, 
clothes, possessions, etc. DI can be coded simultaneously with CM, but 
never with CN, as CN always implies disapproval. An example of DI and CM 
being coded together is when father says to the child, "Put on a shirt 
before you come to the dinner table. I don't like you wearing T-shirts 
to dinner." 

DP (DEPENDENCY): Behavior is coded DP when Person A is requesting 
assistance in doing a task that he is capable of doing himself. For 
example, mother is reading the newspaper in the evening and a child who 
is in junior high school requests her to look up a word in the dictionary; 
or a child, age 10, asks his mother to tie his shoes. Everyday requests 
should not be coded as DP; they must meet two criteria: that the person 
is capable of doing the act himself and it is an imposition on the other 
person to fulfill the request. For example, asking someone to pass the 
newspaper which is very close to the individual to whom the request is 
directed would not be considered DP, since the person would be able to 
hand the newspaper to the other individual without any undue amount of 
effort. If the paper were across the room from where the person is to 
whom the request has been made, and the person would have to move to get 
the paper, thus unduly interrupting whatever he was doing, then the 
request is coded DP. 

NC (NON-COMPLIANCE): This code is used when a person does not do what 
is requested of him by CM, CN or DP. The non-compliance can be of a 
verbal or non-verbal nature. If the request is not to be complied with 
until some later time and the person says he will not comply, then the 
appropriate code is NC. Care must be taken to distinguish DI from NC. 
For example, mother tells daughter to do the dishes; daughter says that 
mother is always making her work; daughter goes to the sink and begins 
to do the dishes; the proper coding is 3CM 4DI CO. 

TE (TEASE): Use this category when a person is teasing another 
person in such a way that the other person is likely to show displeasure 
and disapproval or when the person being teased is trying to do some 
other behavior, but is unable to because of the teasing. For example, a 
child is trying to do homework and another child keeps tickling him in 
the ribs or turns the pages of the book that the child is using for 
studying. Another example would be two parents teasing a young child by 
saying, "You're not my boy; go away from me," and when the child goes to 
the other parent, he hears the same remarks. This category should be 
distinguished from PL, LA, HU, and PN. Many cases of teasing will fall 
into the PL category. 

Verbal Behaviors of the Second Order 

The following are lists of behaviors that should be considered by 
the observer as secondary in coding. If it is possible to code behaviors 
using the First Order behaviors, the Second Order codes should not be 
employed. 

TA (talk): This code is used if none of the other verbal codes are 
applicable. 

Non-Verbal Second Order Codes 

AT (attention): This category is to be used when one person listens 
to or looks at another person, and the categories AP or DI are not 
appropriate. Sometimes when listening is used as a reason for coding AT, it 
may be difficult to tell if the person is listening. The situation will 
generally resolve the question, as the person who has been "listening" 
may make some comment and the content of the comment will indicate that 
he has been listening. 

NR (no response): This category is to be used when a person does not 
respond to another person. This category is applicable when a behavior 
does not require a response, or when behavior is directed at another 
person, but the person to whom the behavior is directed fails to 
perceive the behavior. There is a clear differentiation between NR and IG. 
IG is intentional non-responding and NR may be accidental, e.g., there 
could be a great deal of noise in the house so the person cannot hear the 
behavior to which a response is expected, or the person may be attending 
to something else in the environment, e.g., mother may be feeding the 
baby when an older child comes in and asks a question. Whenever behavior 
is specifically directed toward another person and the person does not 
respond it is necessary to code either NR or IG. 

Specific Rules and Specialized Symbols 

This section will cover the specifics that are involved in observing 
family interaction. The rules are spelled out in detail to 
ensure that every observer follows exactly the same procedure as every other 
observer so that data are comparable from one observer to another. 
Another reason for specifying rules is to make the work of the keypuncher 
easier; if all observers use the same symbols and follow the same tech- 
nique, a keypuncher is able to read any sheet in the same manner and key- 
punch faster and more efficiently. 

One observation sheet is used for each family member for each five- 
minute segment. From the observation sheet the data are then punched 
onto cards so the data can be entered into the computer and analyzed. At 
the top of the Behavior Coding Sheet are several blanks to be filled in 
regarding the family and the observation. The "Family Number" is a 
number given to each family on the basis of their entry into 
the research project in comparison to other families that have already 
been in the project. The ID number is the number that is punched in IBM 
cards for computer purposes. It consists of 10 digits. The "Phase" is 
another blank for the observer to fill. In the phase the numbers can be 
1-5 inclusive. The numbers mean the following: 

1. Regular baseline (meaning the family is seen for 10 consecutive 
week days) 

2. Split baseline (the family is observed for five consecutive week 
days, a week or more intervenes, and the family is observed for 
an additional five days) 

3. Intervention (the family is being seen by a therapist) 

4. Follow-up (the family has already been through intervention and 
is now on their own) 

5. A control condition 

In the "subject" blank the observer puts down the appropriate number 
of the person who is the current focus of observation. In the "observer" 
blank the observer puts her initials; in the "Date" blank, the numbers 
for the current month, day, and year. In the "No" blank, the observer 
records the number of the sheet she has used for the subject for that 
particular observation, e.g., if it is the first five minutes that the 
subject is being observed, the number "1" is placed in the blank. Most 
of the preceding information is then coded into a 10-digit number which 
is placed in the "ID Number" section. The "ID Number" must always come 
out to be nine digits; so, if space is available for two digits and only 
one digit is used, the other space, always to the left, is supplied with 
a "0." Following is a list of the spaces and the information to be 
filled in by use of numbers: 

1-2 spaces - Family Number 
3 - Subject Number 
4 - Phase Number 
5, 6, 7, 8 - Month, Day, Year 
9 - Sheet Number 

For the month and day four spaces are available so the observer must be 
careful to place zeros in the appropriate places if they are needed. For 
example, the date is March 4th. The four spaces reserved for month and 
day would be 0304. 
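
The zero-padding rule can be illustrated with a short Python sketch. It is illustrative only and not part of the original manual. The name build_id_number is hypothetical. The sketch assumes the nine-digit layout in which spaces 5 through 8 hold the month and day only. It leaves the year out because four spaces cannot hold it as well, and the value ranges checked here (for example, sheet numbers 1 through 9) are likewise assumptions drawn from the space widths.

```python
def build_id_number(family, subject, phase, month, day, sheet):
    """Assemble the ID number as a nine-character string of digits.

    Layout assumed: family (2), subject (1), phase (1), month (2),
    day (2), sheet (1). Short values are padded on the left with "0".
    """
    fields = [
        ("family", family, 1, 99),
        ("subject", subject, 1, 8),
        ("phase", phase, 1, 5),
        ("month", month, 1, 12),
        ("day", day, 1, 31),
        ("sheet", sheet, 1, 9),
    ]
    for name, value, low, high in fields:
        if not low <= value <= high:
            raise ValueError("%s out of range: %r" % (name, value))
    return "%02d%d%d%02d%02d%d" % (family, subject, phase, month, day, sheet)
```

Under these assumptions, family 7, subject 2, phase 3, March 4th, second sheet gives "072303042".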

Following the blanks the observer has the codes listed alphabetically 
with the appropriate symbol. Then the main body of the sheet begins with 
a line split into five segments. Each segment is to contain a sequence 
of behavior. The five segments should not be considered as constricting; 
the observer should code more than five sequences if more than five occur. 
Each line represents 30 seconds of data. With 10 lines, five minutes 
of data are collected on each Behavior Coding Sheet. During the data 
collection there are several events that can occur; and, because of these 
events, a series of symbols besides the behavioral categories has been 
devised to record in a simple fashion what those events are. If the 
observer takes a break while coding a line, then the point at which the 
break occurred is coded with the letter "U." If the subject takes a break 
while he is being coded within a line, the observer writes the letter "K" 
at that point where the subject took a break. Examples of a subject 
taking a break are leaving the room during an observation or being out 
of view of the observer for one reason or another. If the observer 
break occurs at the end of a line, then the letter "B" is used; if the 
subject break occurs at the end of a line, the letter "A" is used. 

The symbols "A, B, K," and "U" are to be circled on the Behavioral 
Coding Sheet so that they are clearly discernible from the behavioral 
codes. Sequences that are repeated on a line need not be coded using 
number and letters over and over again. Instead, simply put a dash and 
a slash, i.e., "-/." Never use these symbols at the beginning of a line 
even though the sequence is the same as those in the preceding lines. 
Always write out the first sequence of a line and then the use of the 
dash and slash is appropriate. 
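
The dash-and-slash shorthand can be expanded back into full sequences before analysis. The Python sketch below is illustrative only and not part of the original manual. The name expand_line is hypothetical, and it assumes a line has already been transcribed as a list of strings in which each entry is a full sequence, the repeat mark "-/", or one of the break symbols.

```python
BREAK_SYMBOLS = {"U", "K", "B", "A"}   # circled break markers on the sheet
REPEAT = "-/"                          # dash and slash: same sequence again


def expand_line(entries):
    """Replace each repeat mark with the sequence it stands for.

    Break symbols pass through unchanged. A repeat mark with no written
    sequence before it on the same line is rejected, since a line must
    open with a fully written sequence.
    """
    expanded, last_sequence = [], None
    for entry in entries:
        if entry == REPEAT:
            if last_sequence is None:
                raise ValueError("a line may not begin with the repeat mark")
            expanded.append(last_sequence)
        elif entry in BREAK_SYMBOLS:
            expanded.append(entry)
        else:
            expanded.append(entry)
            last_sequence = entry
    return expanded
```

A line transcribed as ["1PL 2NR", "-/", "-/", "K"] would expand to three copies of 1PL 2NR followed by the subject-break marker.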

The observer is aware of each 30-second period because a timing 
device in the clipboard which is used to hold the Behavior Coding Sheets 
gives off auditory signals through an earplug every 30 seconds. A 
reset button on the outer casing of the timing device should be pushed at 
the beginning of a five-minute segment and at each time when the observer 
picks up from a break that has been taken at the end of a line. When a 
break has occurred in the middle of a line, the observer resumes coding 
on the same line after the break is over, until there are a total of five 
sequences on the line; at that point, the observer is to push the reset 
button and go on to the next line. 

In coding the observer must have complete sequences, i.e., an 
identifying number for the subject, the subject's behavior, an identifying 
number for another person and his behavior. Parts of sequences are 
meaningless in this coding system. Additionally, the observer should 
attempt to begin each line of coding with a behavior of his current 
subject, if possible. 

Sometimes during coding a subject, the observer will find that the 
subject is not interacting with other family members. Thus, the code 
that is proper for the response of the other family members to the subject 
is NR. There is a temptation to code the behavior of the other persons 
in the room; however, there is the danger that to do so would provide a 
non-random sample of the behavior of these other persons. NR should be 
used unless it is clear that the subject being observed is actually 
interacting with (or attending to) other persons. 

Finally, at the end of the coding sheet lines are provided for the 
observer to record the situation that was going on while the subject was 
being observed. The observer should write terse statements or simply 
one-word descriptions of what was occurring, e.g., dinner, working in 
the kitchen, reading a newspaper, etc. When more than one descriptive 
statement is used, the observer should put in the line number or numbers 
appropriate to each statement. Also, any event that occurred that was 
difficult to code should be included so the observer can obtain 
clarification on how the action should be coded. 

Sample Observation 



To illustrate the use of the behavioral code, a fictional description 
of a family is provided. It is unlikely that all the events and behaviors 
that are described would occur in five minutes. In order to illustrate 
examples of all the rules and the use of every behavior, the time element 
has been speeded up considerably, e.g., dinner takes only one minute. 
The deviant child is Kevin, age 9; he is the family member who is being 
observed and is coded as number 1. His sister, Freida, age 8, is number 
4. Mother is number 3 and father is number 2. The observation takes 
place on March 6, 1969 during baseline, and the family is the fiftieth 
family accepted into the research project. The sheet to be coded is the 
first sheet for Kevin this night. 
(1) Kevin is in the living room playing alone with his soldiers. Father 
is also in the living room and he is reading the newspaper. No 
interaction is going on between them. The pattern persists for approximately 
20 seconds, at which time mother calls into the living room and says in 
a conversational tone of voice to father, "How was your day today?" 
Father answers, "It was fine, although it was a little hot driving home 
from the office tonight." Since Kevin does not attend to this 
interaction and is not involved in this interaction and since the parents are 
obviously not providing a consequence for Kevin's playing, this is coded 
1PL 2NR Z 3NR. 

(2) Kevin is still playing with the soldiers and Freida enters from the 
kitchen saying, "OK if I play with you?" Kevin says, "No, I don't like 
the way you play." Freida replies, "Oh, you're such a stupid idiot; you 
don't know if I were playing right or wrong." Father says, "Kevin, let 
Freida play with you." Kevin says, "No." Father says in a stern tone 
of voice, "Kevin, if you know what's good for you, you'll let your sister 
play with you." Kevin replies in a negative tone of voice, "OK." Kevin 
and Freida play soldiers. 

(3) They continue to play soldiers for another 20 seconds. Freida picks 
up one of the soldiers and pulls an arm off it. Kevin hits her. Father 
tells Kevin to leave the room. Kevin obeys. 

(4) A few minutes later Kevin goes into the dining room where the family 
is having dinner. The family eats in silence for 30 seconds. 

(5) Kevin asks his mother, "Mom, will you mash my potatoes?" Mother does. 
Kevin thanks her and father says, "Kevin, I think you are old enough to 
mash your own potatoes now." The observer takes a break. Freida tells 
a joke that she had heard in school. Kevin laughs. Kevin tickles Freida. 
Freida says in a high-pitched, sing-songy voice, "Kevin is picking on me 
again." Kevin stops and eats. 

(6) Kevin goes to father's chair and stands alongside it. Father puts 
his arm around Kevin's shoulders. Kevin says to mother as Freida looks 
at Kevin, "Can I go out and play after supper?" Mother does not reply. 
Kevin raises his voice and repeats the question. Mother says, "You don't 
have to yell; I can hear you." Father says, "How many times have I told 
you not to yell at your mother?" Kevin scratches a bruise on his arm 
while mother tells Freida to get started on the dishes, which Freida does. 
Kevin continues to rub and scratch his arm while mother and daughter are 
working at the kitchen sink. 

(7) Mother tells Kevin to empty the rubbish. Kevin takes the garbage out 
of the house. He returns crying, while mother and Freida look at him. 
He says, "A bee stung me." As mother says, "Where did it bite you?", 
Freida and father watch Kevin. The father says, "What a big sissy, crying 
over a little bee sting." Kevin replies, "You're so mean, you don't know 
how much it hurts." Mother hands Kevin a tube of salve. The observer 
takes a break. 

(8) Kevin is doing his homework on the living room floor while Freida, 
mother, and father engage in a conversation. This continues for about 
20 seconds. Mother gets up from the knitting she has been doing, and 
comes over to Kevin and says, "You look so hot, dear, let me take your 
sweater off for you." Kevin allows her to take off the sweater. She 
then says, "I'll do these arithmetic problems for you," and takes the 
pencil out of his hand and works the problems for Kevin while he looks on. 

(9) She continues for a few seconds to do his problems for him. Kevin 
says, "I think I know what the answer to the next problem is." Mother 
doesn't seem to hear. Kevin gets up from his homework and begins 
running around the room. Mother and father say loudly to him, "Why do 
you have to make so much noise?" Father then says, "If you don't stop 
this nonsense immediately, you're not going on the picnic Saturday." 
Kevin stops. Kevin returns to his homework, while mother and father are 
discussing plans for the picnic. 

(10) Kevin says to Freida, "Hand me that book on the table, will you please?" 
She does. Kevin thanks her. Freida says in an amused tone of voice, 
"You're just sooo polite." Kevin continues to work while father reads 
the newspaper, mother knits, and Freida listens to the phonograph. Then 
the other three family members begin a discussion of the picnic and Kevin 
looks at them as they talk. He returns to work and Freida watches him. 

BEHAVIOR CODING SHEET 

Family Number: 50        ID Number: [illegible] 
SUBJECT ____    OBSERVER ____    DATE ____    PHASE ____ NO ____ 

Behavior Codes 

AP  Approval              DS  Destructiveness     PN  Negative Physical Contact 
AT  Attention             HR  High Rate           PP  Positive Physical Contact 
CM  Command               HU  Humiliate           TA  Talk 
CN  Command (negative)    LA  Laugh               TE  Tease 
CO  Compliance            NC  Non-compliance      WH  Whine 
CR  Cry                   NO  Normative           YE  Yell 
DI  Disapproval           NR  No Response 
DP  Dependency 

[Coding grid: rows numbered 1 through 10; the handwritten code entries are not legible in this copy.] 

Description: [handwritten entry illegible] 

APPENDIX B 

FP-FA Study 
November 1971 

OBSERVER ASSISTANT INVENTORY 

Date ______        Group (circle one):   9:30   11:00   1:00 

Check the point on each of the following scales which reflects your 
feelings or understanding of this research project right now. 

For me, this observation task is: 

1     2     3     4     5     6     7 
Easy                   Very difficult 

I ______ assisting with the data collection for this research project. 

1     2     3     4     5     6     7 
Like                          Dislike 

I feel the work of an observer is: 

1     2     3     4     5     6     7 
Challenging                    Boring 

The research assistant supervising this project expects of us: 

1     2     3     4     5     6     7 
Too much                   Too little 

This research project is: 

1     2     3     4     5     6     7 
Interesting             Uninteresting 

I feel that the data we are collecting in this research project is: 

1     2     3     4     5     6     7 
Important                 Unimportant 

The experimenter's prediction for the set of FA tapes seen by this group 
of observers was that the rate of deviant behavior in the family would: 

+75%    +50%    +25%     0%     -25%    -50%    -75% 

Increase              or have              Decrease 
                   unknown effects 



If I were in charge of this research project I would run it: 

1     2     3     4     5     6     7 
The same way              Differently 



My suggestions for changing or improving the procedures used in this or 
future research projects of this nature are: 

APPENDIX C 

OBSERVER ASSISTANT QUESTIONNAIRE 
December 1971 

1. In several sentences give your understanding of what this study was 
about. 



2. What were you told about the history of the Craig family? 



3. As far as you can, indicate the variable being manipulated, the 
specific variables (behaviors) of interest to the investigator 
and the investigator's prediction about the variables being 
measured by your group's observations of the videotapes. 



4. What evidence or arguments were presented by the investigator to 
support the prediction given your group? 

5. Did you have any personal expectations regarding the outcome of the 
study? If so, what were they and were you more motivated to see 
the investigator's prediction or your own confirmed by the results 
of this study? 



6. Why were the observers told about the study? 



7. If you were the investigator, would you have conducted this study 
any differently? How would you have conducted it? Why? 




OBSERVER ASSISTANT QUESTIONNAIRE - 2 

At any time during this pilot study (where you were coding FP and FA 
videotapes), were you suspicious of the rationale given you by the 
investigator? That is, did you question whether you were being told 
the truth and that the investigator was studying what he described? 

a. If yes, at what point did you become suspicious and why? 



b. If yes, what do you think the investigator was really studying? 



APPENDIX D 

Memo to: Judges 
From: Karl Skindrud 

Re: Instructions for Sorting the Observer Assistant Questionnaires 

I am asking you to take 30 minutes to read through the questionnaires 
completed by a group of 27 subjects. You will be sorting the 
questionnaires into three groups, as explained below. 

The subjects were divided into three groups which were given different 
instructions regarding a study in which they were participating. 
The instructions given each of the groups can be summarized as follows: 

Control group: "You will be observing two sets of videotapes of family 
interaction--one set with the father present and another of the same 
family with the father absent. We hope to determine the effect of the 
father's absence upon family interaction." 

The essential components that differentiate the instructions given 
the control group from at least one of the other groups include: 

1) no specific information given about family status 

2) no predictions were made 

3) no specific behaviors were given special importance 

4) no relevant studies were quoted or theories expounded 

Increase group: "You will be observing two sets of videotapes of family 
interaction--one set with the father present and another of the same 
family with the father absent. The parents of the family were concerned 
about the behavior of their youngest boy, Craig, particularly his 
whining, yelling, and high rate behaviors. The tapes you will be 
observing were made at various stages of a treatment program. In the 
sets of tapes your group will be observing, we are predicting that 
during the father's absence an increase in the rate of deviant behaviors 
will be observed. By "rate of deviant behaviors" we mean the rate with 
which the CN, CR, DI, DS, DP, HU, HR, NC, PN, TE, WH, and YE codes are 
observed. We are quite certain that this prediction will be confirmed 
in this study because other studies have shown these trends as a result 
of the father's absence and we have theoretical reasons for predicting 
an increase. We are conducting this large-scale study to confirm these 
trends." 

The essential components that differentiate the instructions given 
the increase group from at least one of the other groups include: 

1) given information about the status of the family observed (about 
to enter a treatment program) 

2) given the experimenter's prediction regarding an increase in 
deviant behaviors during the father's absence 

3) the specific codes predicted to change were listed 

4) the experimenter's prediction was supported by early data returns 
from other studies and theory. 

Decrease group: "You will be observing two sets of videotapes of family 
interaction--one set with the father present and another of the same 
family with the father absent. The parents of the family were concerned 
about the behavior of their youngest boy, Craig, particularly his 
whining, yelling, and high rate behaviors. The tapes you will be 
observing were made at various stages of a treatment program. In the sets 
of tapes your group will be observing, we are predicting that during 
the father's absence a decrease in the rate of deviant behaviors will be 
observed. By "rate of deviant behaviors" we mean the rate with which 
the CN, CR, DI, DS, DP, HU, HR, NC, PN, TE, WH, and YE codes are observed. 
We are quite certain that this prediction will be confirmed in this study 
because other studies have shown these trends as a result of the father's 
absence under the same treatment conditions. We also have theoretical 
reasons for predicting a decrease. We are conducting this large-scale 
study to confirm these trends." 

The essential components that differentiate the instructions given 
the decrease group from at least one of the other groups include: 

1) given information about the status of the family observed (in 
treatment) 

2) given the experimenter's prediction regarding a decrease in 
deviant behaviors during the father's absence 

3) the specific codes predicted to change were listed 

4) the experimenter's prediction was supported by early data 
returns from other studies and theory. 

Instructions to the sorters: Your task is to read each one of the 
questionnaires and to sort them into three piles according to the 
information given you. If sorted correctly, you should finish with the 
following distribution: 

Control Group        Increase Group       Decrease Group 
5 questionnaires     11 questionnaires    11 questionnaires 

If you do not finish with this distribution, re-sort borderline cases 
until you achieve the above distribution. (Please note that some subjects 
saw the father-absent tapes following and some before the father-present 
tapes. Consequently, the "increases" and "decreases" described should 
always be relative to the father-absent condition.) Please place each 
pile in one of the labeled envelopes provided as appropriate. Return to 
Karl Skindrud for recording of your sortings as soon as possible. Thank 
you for your help. 
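
The "rate of deviant behaviors" defined in the group instructions above can be illustrated with a minimal sketch. It assumes each observation is available as a plain list of two-letter code strings and that the rate is expressed per minute of observation; both the data layout and the per-minute unit are assumptions made for illustration, as are the function names, and none of them is taken from the study's own scoring procedure.

```python
# The twelve codes that the group instructions list as "deviant".
DEVIANT_CODES = frozenset(
    {"CN", "CR", "DI", "DS", "DP", "HU", "HR", "NC", "PN", "TE", "WH", "YE"}
)


def deviant_rate(codes, minutes_observed):
    """Return deviant codes per minute (the per-minute unit is an assumption)."""
    if minutes_observed <= 0:
        raise ValueError("minutes_observed must be positive")
    deviant_count = sum(1 for code in codes if code.upper() in DEVIANT_CODES)
    return deviant_count / minutes_observed


def change_in_rate(fp_codes, fa_codes, minutes_fp, minutes_fa):
    """Father-absent (FA) rate minus father-present (FP) rate.

    A positive value corresponds to an "increase" and a negative value to a
    "decrease", both expressed relative to the father-absent condition.
    """
    return deviant_rate(fa_codes, minutes_fa) - deviant_rate(fp_codes, minutes_fp)
```

Under these assumptions, `change_in_rate(fp_codes, fa_codes, minutes_fp, minutes_fa)` gives the signed difference that the increase, decrease, and control groups were led to expect differently.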

Memo to: Judges 
From: Karl Skindrud 

Re: Instructions for Sorting Questionnaire Responses to item 5 

Please take 10 minutes to read each observer's response to item 5 of 
the Observer Assistant Inventory. I am interested in your judgement as 
to the observer's personal expectations of change in rate of deviant 
behavior from the FP to the FA condition. Did the observer expect an 
increase, no change, a decrease, or admit no personal expectations for 
the Craig family FA tapes? 

Sort the attached 27 questionnaires into three piles according to 
item 5 responses as follows: 

Increase          No Change          Decrease 
                     or 
          No Personal Expectations 

If you find responses which only indicate agreement or disagreement with 
the experimenter's prediction, place them in a separate pile with your 
judgement attached and I will assign the response to one of the above 
piles according to actual group membership. 
Thank you for your assistance. 



